Search | arXiv e-print repository

arXiv:2407.19346 [pdf, other]

Polynomial Regression as a Task for Understanding In-context Learning Through Finetuning and Alignment

Authors: Max Wilcoxson, Morten Svendgård, Ria Doshi, Dylan Davis, Reya Vir, Anant Sahai

Abstract: Simple function classes have emerged as toy problems to better understand in-context-learning in transformer-based architectures used for large language models. But previously proposed simple function classes like linear regression or multi-layer-perceptrons lack the structure required to explore things like prompting and alignment within models capable of in-context-learning. We propose univariate polynomial regression as a function class that is just rich enough to study prompting and alignment, while allowing us to visualize and understand what is going on clearly.

Submitted 27 July, 2024; originally announced July 2024.

Comments: ICML Workshop on In-Context Learning

arXiv:2406.12680 [pdf, other]

Measuring Psychological Depth in Language Models

Authors: Fabrice Harel-Canada, Hanyu Zhou, Sreya Mupalla, Zeynep Yildiz, Amit Sahai, Nanyun Peng

Abstract: Evaluations of creative stories generated by large language models (LLMs) often focus on objective properties of the text, such as its style, coherence, and toxicity. While these metrics are indispensable, they do not speak to a story's subjective, psychological impact from a reader's perspective. We introduce the Psychological Depth Scale (PDS), a novel framework rooted in literary theory that measures an LLM's ability to produce authentic and narratively complex stories that provoke emotion, empathy, and engagement. We empirically validate our framework by showing that humans can consistently evaluate stories based on PDS (0.72 Krippendorff's alpha). We also explore techniques for automating the PDS to easily scale future analyses. GPT-4o, combined with a novel Mixture-of-Personas (MoP) prompting strategy, achieves an average Spearman correlation of $0.51$ with human judgment while Llama-3-70B scores as high as 0.68 for empathy. Finally, we compared the depth of stories authored by both humans and LLMs. Surprisingly, GPT-4 stories either surpassed or were statistically indistinguishable from highly-rated human-written stories sourced from Reddit. By shifting the focus from text to reader, the Psychological Depth Scale is a validated, automated, and systematic means of measuring the capacity of LLMs to connect with humans through the stories they tell.

Submitted 18 June, 2024; originally announced June 2024.

Comments: Preprint. Under Review

arXiv:2306.13255 [pdf, other]

Precise Asymptotic Generalization for Multiclass Classification with Overparameterized Linear Models

Authors: David X. Wu, Anant Sahai

Abstract: We study the asymptotic generalization of an overparameterized linear model for multiclass classification under the Gaussian covariates bi-level model introduced in Subramanian et al. '22, where the number of data points, features, and classes all grow together. We fully resolve the conjecture posed in Subramanian et al. '22, matching the predicted regimes for generalization. Furthermore, our new lower bounds are akin to an information-theoretic strong converse: they establish that the misclassification rate goes to 0 or 1 asymptotically. One surprising consequence of our tight results is that the min-norm interpolating classifier can be asymptotically suboptimal relative to noninterpolating classifiers in the regime where the min-norm interpolating regressor is known to be optimal. The key to our tight analysis is a new variant of the Hanson-Wright inequality which is broadly useful for multiclass problems with sparse labels. As an application, we show that the same type of analysis can be used to analyze the related multilabel classification problem under the same bi-level ensemble.

Submitted 5 December, 2023; v1 submitted 22 June, 2023; originally announced June 2023.

Comments: NeurIPS 2023, 56 pages

arXiv:2206.01399 [pdf, other]

Generalization for multiclass classification with overparameterized linear models

Authors: Vignesh Subramanian, Rahul Arya, Anant Sahai

Abstract: Via an overparameterized linear model with Gaussian features, we provide conditions for good generalization for multiclass classification of minimum-norm interpolating solutions in an asymptotic setting where both the number of underlying features and the number of classes scale with the number of training points. The survival/contamination analysis framework for understanding the behavior of overparameterized learning problems is adapted to this setting, revealing that multiclass classification qualitatively behaves like binary classification in that, as long as there are not too many classes (made precise in the paper), it is possible to generalize well even in some settings where the corresponding regression tasks would not generalize. Besides various technical challenges, it turns out that the key difference from the binary classification setting is that there are relatively fewer positive training examples of each class in the multiclass setting as the number of classes increases, making the multiclass problem "harder" than the binary one.

Submitted 3 June, 2022; originally announced June 2022.

Comments: 44 pages, 4 figures

arXiv:2109.13215 [pdf, other]

Classification and Adversarial examples in an Overparameterized Linear Model: A Signal Processing Perspective

Authors: Adhyyan Narang, Vidya Muthukumar, Anant Sahai

Abstract: State-of-the-art deep learning classifiers are heavily overparameterized with respect to the amount of training examples and observed to generalize well on "clean" data, but be highly susceptible to infinitesimal adversarial perturbations. In this paper, we identify an overparameterized linear ensemble, that uses the "lifted" Fourier feature map, that demonstrates both of these behaviors. The input is one-dimensional, and the adversary is only allowed to perturb these inputs and not the non-linear features directly. We find that the learned model is susceptible to adversaries in an intermediate regime where classification generalizes but regression does not. Notably, the susceptibility arises despite the absence of model mis-specification or label noise, which are commonly cited reasons for adversarial-susceptibility. These results are extended theoretically to a random-Fourier-sum setup that exhibits double-descent behavior. In both feature-setups, the adversarial vulnerability arises because of a phenomenon we term spatial localization: the predictions of the learned model are markedly more sensitive in the vicinity of training points than elsewhere. This sensitivity is a consequence of feature lifting and is reminiscent of Gibbs' and Runge's phenomena from signal processing and functional analysis. Despite the adversarial susceptibility, we find that classification with these features can be easier than the more commonly studied "independent feature" models.

Submitted 27 September, 2021; originally announced September 2021.

Comments: 32 pages, 10 figures

arXiv:2012.02125 [pdf, other]

On the Impossibility of Convergence of Mixed Strategies with No Regret Learning

Authors: Vidya Muthukumar, Soham Phade, Anant Sahai

Abstract: We study the limiting behavior of the mixed strategies that result from optimal no-regret learning strategies in a repeated game setting where the stage game is any 2 by 2 competitive game. We consider optimal no-regret algorithms that are mean-based and monotonic in their argument. We show that for any such algorithm, the limiting mixed strategies of the players cannot converge almost surely to any Nash equilibrium. This negative result is also shown to hold under a broad relaxation of these assumptions, including popular variants of Online-Mirror-Descent with optimism and/or adaptive step-sizes. Finally, we conjecture that the monotonicity assumption can be removed, and provide partial evidence for this conjecture. Our results identify the inherent stochasticity in players' realizations as a critical factor underlying this divergence in outcomes between using the opponent's mixtures and realizations to make updates.

Submitted 2 March, 2022; v1 submitted 3 December, 2020; originally announced December 2020.

Comments: 47 pages, 12 figures

arXiv:2008.09317 [pdf, ps, other]

Indistinguishability Obfuscation from Well-Founded Assumptions

Authors: Aayush Jain, Huijia Lin, Amit Sahai

Abstract: In this work, we show how to construct indistinguishability obfuscation from subexponential hardness of four well-founded assumptions. We prove: Let $τ\in (0,\infty), δ\in (0,1), ε\in (0,1)$ be arbitrary constants. Assume sub-exponential security of the following assumptions, where $λ$ is a security parameter, and the parameters $\ell,k,n$ below are large enough polynomials in $λ$:
- The SXDH assumption on asymmetric bilinear groups of a prime order $p = O(2^λ)$,
- The LWE assumption over $\mathbb{Z}_{p}$ with subexponential modulus-to-noise ratio $2^{k^ε}$, where $k$ is the dimension of the LWE secret,
- The LPN assumption over $\mathbb{Z}_p$ with polynomially many LPN samples and error rate $1/\ell^δ$, where $\ell$ is the dimension of the LPN secret,
- The existence of a Boolean PRG in $\mathsf{NC}^0$ with stretch $n^{1+τ}$,
Then, (subexponentially secure) indistinguishability obfuscation for all polynomial-size circuits exists.

Submitted 21 August, 2020; originally announced August 2020.

arXiv:2005.08054 [pdf, other]

Classification vs regression in overparameterized regimes: Does the loss function matter?

Authors: Vidya Muthukumar, Adhyyan Narang, Vignesh Subramanian, Mikhail Belkin, Daniel Hsu, Anant Sahai

Abstract: We compare classification and regression tasks in an overparameterized linear model with Gaussian features. On the one hand, we show that with sufficient overparameterization all training points are support vectors: solutions obtained by least-squares minimum-norm interpolation, typically used for regression, are identical to those produced by the hard-margin support vector machine (SVM) that minimizes the hinge loss, typically used for training classifiers. On the other hand, we show that there exist regimes where these interpolating solutions generalize well when evaluated by the 0-1 test loss function, but do not generalize if evaluated by the square loss function, i.e. they approach the null risk. Our results demonstrate the very different roles and properties of loss functions used at the training phase (optimization) and the testing phase (generalization).

Submitted 14 October, 2021; v1 submitted 16 May, 2020; originally announced May 2020.

Journal ref: Journal of Machine Learning Research, 22(222):1-69, 2021
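
The "all training points are support vectors" claim in the abstract above can be checked numerically on a small instance. The following is a minimal sketch, not the authors' code: the sizes `n` and `d`, the random seed, and the isotropic Gaussian features are illustrative assumptions. It relies on the standard KKT conditions for the hard-margin SVM: the minimum-norm interpolator of the labels is also the SVM solution when every candidate dual variable `y_i * beta_i` is positive.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 5000  # illustrative sizes: far more features than training points
X = rng.standard_normal((n, d))        # isotropic Gaussian features (assumption)
y = rng.choice([-1.0, 1.0], size=n)    # binary labels

# Minimum-norm interpolator of the labels: w = X^T (X X^T)^{-1} y.
beta = np.linalg.solve(X @ X.T, y)
w = X.T @ beta

# Interpolating +/-1 labels means every training margin equals 1.
margins = y * (X @ w)

# Candidate SVM dual variables; if all are positive, the KKT conditions hold
# and every training point is a support vector.
dual = y * beta
all_support_vectors = bool(np.all(dual > 0))
```

With `d` much larger than `n`, the Gram matrix `X @ X.T` is close to `d` times the identity, which is why the dual variables come out positive here; shrinking `d` toward `n` is one way to see the equivalence break down.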

arXiv:1910.09630 [pdf, other]

doi 10.1109/ACCESS.2020.2984218

Blind interactive learning of modulation schemes: Multi-agent cooperation without co-design

Authors: Anant Sahai, Joshua Sanz, Vignesh Subramanian, Caryn Tran, Kailas Vodrahalli

Abstract: We examine the problem of learning to cooperate in the context of wireless communication. In our setting, two agents must learn modulation schemes that enable them to communicate across a power-constrained additive white Gaussian noise channel. We investigate whether learning is possible under different levels of information sharing between distributed agents which are not necessarily co-designed. We employ the "Echo" protocol, a "blind" interactive learning protocol where an agent hears, understands, and repeats (echoes) back the message received from another agent, simultaneously training itself to communicate. To capture the idea of cooperation between "not necessarily co-designed" agents we use two different populations of function approximators - neural networks and polynomials. We also include interactions between learning agents and non-learning agents with fixed modulation protocols such as QPSK and 16QAM. We verify the universality of the Echo learning approach, showing it succeeds independent of the inner workings of the agents. In addition to matching the communication expectations of others, we show that two learning agents can collaboratively invent a successful communication approach from independent random initializations. We complement our simulations with an implementation of the Echo protocol in software-defined radios. To explore the continuum of co-design, we study how learning is impacted by different levels of information sharing between agents, including sharing training symbols, losses, and full gradients.
We find that co-design (increased information sharing) accelerates learning. Learning higher order modulation schemes is a more difficult task, and the beneficial effect of co-design becomes more pronounced as the task becomes harder.

Submitted 1 April, 2020; v1 submitted 21 October, 2019; originally announced October 2019.

Comments: 33 pages, 25 figures, code can be found at https://github.com/ml4wireless/echo, accepted for publication in IEEE Access

arXiv:1905.11555 [pdf, other]

Robust Commitments and Partial Reputation

Authors: Vidya Muthukumar, Anant Sahai

Abstract: Agents rarely act in isolation -- their behavioral history, in particular, is public to others. We seek a non-asymptotic understanding of how a leader agent should shape this history to its maximal advantage, knowing that follower agent(s) will be learning and responding to it. We study Stackelberg leader-follower games with finite observations of the leader commitment, which commonly models security games and network routing in engineering, and persuasion mechanisms in economics. First, we formally show that when the game is not zero-sum and the vanilla Stackelberg commitment is mixed, it is not robust to observational uncertainty. We propose observation-robust, polynomial-time-computable commitment constructions for leader strategies that approximate the Stackelberg payoff, and also show that these commitment rules approximate the maximum obtainable payoff (which could in general be greater than the Stackelberg payoff).

Submitted 27 May, 2019; originally announced May 2019.

Comments: 29 pages, extended abstract at ACM Economics and Computation 2019

arXiv:1904.09252 [pdf, ps, other]

Learning Physical-Layer Communication with Quantized Feedback

Authors: Jinxiang Song, Bile Peng, Christian Häger, Henk Wymeersch, Anant Sahai

Abstract: Data-driven optimization of transmitters and receivers can reveal new modulation and detection schemes and enable physical-layer communication over unknown channels. Previous work has shown that practical implementations of this approach require a feedback signal from the receiver to the transmitter. In this paper, we study the impact of quantized feedback in data-driven learning of physical-layer communication. A novel quantization method is proposed, which exploits the specific properties of the feedback signal and is suitable for non-stationary signal distributions. The method is evaluated for linear and nonlinear channels. Simulation results show that feedback quantization does not appreciably affect the learning process and can lead to excellent performance, even with $1$-bit quantization. In addition, it is shown that learning is surprisingly robust to noisy feedback where random bit flips are applied to the quantization bits.

Submitted 4 November, 2019; v1 submitted 19 April, 2019; originally announced April 2019.

arXiv:1903.09139 [pdf, other]

Harmless interpolation of noisy data in regression

Authors: Vidya Muthukumar, Kailas Vodrahalli, Vignesh Subramanian, Anant Sahai

Abstract: A continuing mystery in understanding the empirical success of deep neural networks is their ability to achieve zero training error and generalize well, even when the training data is noisy and there are more parameters than data points. We investigate this overparameterized regime in linear regression, where all solutions that minimize training error interpolate the data, including noise. We characterize the fundamental generalization (mean-squared) error of any interpolating solution in the presence of noise, and show that this error decays to zero with the number of features. Thus, overparameterization can be explicitly beneficial in ensuring harmless interpolation of noise. We discuss two root causes for poor generalization that are complementary in nature -- signal "bleeding" into a large number of alias features, and overfitting of noise by parsimonious feature selectors. For the sparse linear model with noise, we provide a hybrid interpolating scheme that mitigates both these issues and achieves order-optimal MSE over all possible interpolating solutions.

Submitted 9 September, 2019; v1 submitted 21 March, 2019; originally announced March 2019.

Comments: 52 pages, expanded version of the paper presented at ITA in San Diego in Feb 2019, ISIT in Paris in July 2019, at Simons in July, and as a plenary at ITW in Visby in August 2019
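
To make the notion of an interpolating solution in the abstract above concrete, here is a minimal sketch of the minimum-l2-norm interpolator in an overparameterized linear regression. It is an illustration only, not the paper's hybrid scheme: the helper name `min_norm_interpolator`, the problem sizes, the noise level, and the one-sparse true signal are all hypothetical choices.

```python
import numpy as np

def min_norm_interpolator(X, y):
    """Return the minimum-l2-norm w with X @ w = y, assuming X (n x d, d > n) has full row rank."""
    return np.linalg.pinv(X) @ y

rng = np.random.default_rng(0)
n, d = 20, 200                      # more features than data points (illustrative)
X = rng.standard_normal((n, d))     # Gaussian features (assumption)
w_true = np.zeros(d)
w_true[0] = 1.0                     # hypothetical one-sparse signal
y = X @ w_true + 0.1 * rng.standard_normal(n)   # noisy observations

w_hat = min_norm_interpolator(X, y)
train_residual = np.max(np.abs(X @ w_hat - y))  # ~0: the noise is interpolated too

# Any other interpolator differs from w_hat by a null-space vector of X
# and therefore has norm at least as large.
v = rng.standard_normal(d)
null_component = v - np.linalg.pinv(X) @ (X @ v)
w_other = w_hat + null_component
```

Because `w_hat` lies in the row space of `X` and `null_component` is orthogonal to it, `w_other` also fits the training data exactly while having a larger norm, which illustrates that infinitely many interpolating solutions exist in this regime.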

arXiv:1901.05061 [pdf, other]

Spectrogram Feature Losses for Music Source Separation

Authors: Abhimanyu Sahai, Romann Weber, Brian McWilliams

Abstract: In this paper we study deep learning-based music source separation, and explore using an alternative loss to the standard spectrogram pixel-level L2 loss for model training. Our main contribution is in demonstrating that adding a high-level feature loss term, extracted from the spectrograms using a VGG net, can improve separation quality vis-a-vis a pure pixel-level loss. We show this improvement in the context of the MMDenseNet, a State-of-the-Art deep learning model for this task, for the extraction of drums and vocal sounds from songs in the musdb18 database, covering a broad range of western music genres. We believe that this finding can be generalized and applied to broader machine learning-based systems in the audio domain.

Submitted 26 June, 2019; v1 submitted 15 January, 2019; originally announced January 2019.

Comments: Accepted for presentation at the 27th European Signal Processing Conference (EUSIPCO 2019)

MSC Class: 62; 68

ACM Class: I.2.6; H.5.5

arXiv:1810.00106 [pdf, ps, other]

Expander Graphs are Non-Malleable Codes

Authors: Peter M. R. Rasmussen, Amit Sahai

Abstract: Any $d$-regular graph on $n$ vertices with spectral expansion $λ$ satisfying $n = Ω(d^3\log(d)/λ)$ yields a $O\left(\frac{λ^{3/2}}{d}\right)$-non-malleable code for single-bit messages in the split-state model.

Submitted 20 March, 2019; v1 submitted 28 September, 2018; originally announced October 2018.

Comments: 10 pages. Resubmitted with revised introduction and acknowledgement

arXiv:1806.08777 [pdf, other]

Wireless Channel Dynamics and Robustness for Ultra-Reliable Low-Latency Communications

Authors: Vasuki Narasimha Swamy, Paul Rigge, Gireeja Ranade, Borivoje Nikolic, Anant Sahai

Abstract: Interactive, immersive and critical applications demand ultra-reliable low-latency communication (URLLC). To build wireless communication systems that can support these applications, understanding the characteristics of the wireless medium is paramount. Although wireless channel characteristics and dynamics have been extensively studied, it is important to revisit these concepts in the context of the strict demands of low latency and ultra-reliability. In this paper, we bring a modeling approach from robust control to wireless communication -- the wireless channel characteristics are given a nominal model around which we allow for some quantified uncertainty. We propose certain key "directions" along which to bound model uncertainty that are relevant to URLLC. For the nominal model, we take an in-depth look at wireless channel characteristics such as spatial and temporal correlations based on Jakes' model. Contrary to what has been claimed in the literature, we find that standard Rayleigh fading processes are not bandlimited. This has significant implications on the predictability of channels. We also find that under reasonable conditions the spatial correlation of channels provides a fading distribution that is not too far off from an independent spatial fading model. Additionally, we look at the impact of these channel models on cooperative communication based systems.
We find that while spatial-diversity-based techniques are necessary to combat the adverse effects of fading, time-diversity-based techniques are necessary to be robust against unmodeled errors. Robust URLLC systems need to operate with both an SNR margin and a time/repetition margin.

Submitted 22 June, 2018; originally announced June 2018.

Comments: Submitted to IEEE JSAC Special Issue on Ultra-Reliable Low-Latency Communications in Wireless Networks

arXiv:1805.08562 [pdf, other]

Best of many worlds: Robust model selection for online supervised learning

Authors: Vidya Muthukumar, Mitas Ray, Anant Sahai, Peter L. Bartlett

Abstract: We introduce algorithms for online, full-information prediction that are competitive with contextual tree experts of unknown complexity, in both probabilistic and adversarial settings. We show that by incorporating a probabilistic framework of structural risk minimization into existing adaptive algorithms, we can robustly learn not only the presence of stochastic structure when it exists (leading to constant as opposed to $\mathcal{O}(\sqrt{T})$ regret), but also the correct model order. We thus obtain regret bounds that are competitive with the regret of an optimal algorithm that possesses strong side information about both the complexity of the optimal contextual tree expert and whether the process generating the data is stochastic or adversarial. These are the first constructive guarantees on simultaneous adaptivity to the model and the presence of stochasticity.

Submitted 22 May, 2018; originally announced May 2018.

Comments: 33 pages, 5 figures

arXiv:1803.05143 [pdf, other]

Network Coding for Real-time Wireless Communication for Automation

Authors: Vasuki Narasimha Swamy, Paul Rigge, Gireeja Ranade, Anant Sahai, Borivoje Nikolic

Abstract: Real-time applications require latencies on the order of a millisecond with very high reliabilities, paralleling the requirements for high-performance industrial control. Current wireless technologies like WiFi, Bluetooth, LTE, etc. are unable to meet these stringent latency and reliability requirements, forcing the use of wired systems. This paper introduces a wireless communication protocol based on network coding that in conjunction with cooperative communication techniques builds the necessary diversity to achieve the target reliability. The proposed protocol is analyzed using a communication theoretic delay-limited-capacity framework and compared to proposed protocols without network coding. The results show that for larger network sizes or payloads employing network coding lowers the minimum SNR required to achieve the target reliability. For a scenario inspired by an industrial printing application with $30$ nodes in the control loop, aggregate throughput of $4.8$ Mb/s, $20$MHz of bandwidth and cycle time under $2$ ms, the protocol can robustly achieve a system probability of error better than $10^{-9}$ with a nominal SNR less than $2$ dB under ideal channel conditions.

Submitted 14 March, 2018; originally announced March 2018.

Comments: A preliminary version of this work appeared at IEEE WCNC 2016

arXiv:1801.04541 [pdf, other]

Cooperative Multi-Agent Reinforcement Learning for Low-Level Wireless Communication

Authors: Colin de Vrieze, Shane Barratt, Daniel Tsai, Anant Sahai

Abstract: Traditional radio systems are strictly co-designed on the lower levels of the OSI stack for compatibility and efficiency. Although this has enabled the success of radio communications, it has also introduced lengthy standardization processes and imposed static allocation of the radio spectrum. Various initiatives have been undertaken by the research community to tackle the problem of artificial spectrum scarcity by both making frequency allocation more dynamic and building flexible radios to replace the static ones. There is reason to believe that just as computer vision and control have been overhauled by the introduction of machine learning, wireless communication can also be improved by utilizing similar techniques to increase the flexibility of wireless networks. In this work, we pose the problem of discovering low-level wireless communication schemes ex-nihilo between two agents in a fully decentralized fashion as a reinforcement learning problem. Our proposed approach uses policy gradients to learn an optimal bi-directional communication scheme and shows surprisingly sophisticated and intelligent learning behavior. We present the results of extensive experiments and an analysis of the fidelity of our approach.

Submitted 14 January, 2018; originally announced January 2018.

arXiv:1703.05348 [pdf, ps, other]

Layered black-box, behavioral interconnection perspective and applications to the problem of communication with fidelity criteria, Part II: stationary sources satisfying ψ-mixing criterion

Authors: Mukul Agarwal, Sanjoy Mitter, Anant Sahai

Abstract: Theorems from Part 1 of this paper are generalized to ψ-mixing sources in this paper. Application to Markoff chains and order m Markoff chains is presented. The main result is the generalization of Theorem 1 in Part 1.

Submitted 23 March, 2018; v1 submitted 15 March, 2017; originally announced March 2017.

arXiv:1703.05346 [pdf, ps, other]

Layered black-box, behavioral interconnection perspective and applications to the problem of communication with fidelity criteria, Part I: i.i.d. sources

Authors: Mukul Agarwal, Sanjoy Mitter, Anant Sahai

Abstract: In this paper, the problem of communication over an essentially unknown channel, which is known to be able to communicate a source to a destination to within a certain distortion level, is considered from a behavioral, interconnection view-point. Rates of reliable communication are derived and source-channel separation for communication with fidelity criteria is proved. The results are then generalized to the multi-user setting under certain assumptions. Other applications of this problem which follow from this perspective are discussed.

Submitted 26 March, 2018; v1 submitted 15 March, 2017; originally announced March 2017.

arXiv:1701.04187 [pdf, other]

Control Capacity

Authors: Gireeja Ranade, Anant Sahai

Abstract: Feedback control actively dissipates uncertainty from a dynamical system by means of actuation. We develop a notion of "control capacity" that gives a fundamental limit (in bits) on the rate at which a controller can dissipate the uncertainty from a system, i.e. stabilize to a known fixed point. We give a computable single-letter characterization of control capacity for memoryless stationary scalar multiplicative actuation channels. Control capacity allows us to answer questions of stabilizability for scalar linear systems: a system with actuation uncertainty is stabilizable if and only if the control capacity is larger than the log of the unstable open-loop eigenvalue. For second-moment senses of stability, we recover the classic uncertainty threshold principle result. However, our definition of control capacity can quantify the stabilizability limits for any moment of stability. Our formulation parallels the notion of Shannon's communication capacity, and thus yields both a strong converse and a way to compute the value of side-information in control. The results in our paper are motivated by bit-level models for control that build on the deterministic models that are widely used to understand information flows in wireless network information theory.

Submitted 16 January, 2017; originally announced January 2017.

Comments: 52 pages

arXiv:1609.02968 [pdf, other]

Real-time Cooperative Communication for Automation over Wireless

Authors: Vasuki Narasimha Swamy, Sahaana Suri, Paul Rigge, Matthew Weiner, Gireeja Ranade, Anant Sahai, Borivoje Nikolic

Abstract: High-performance industrial automation systems rely on tens of simultaneously active sensors and actuators and have stringent communication latency and reliability requirements. Current wireless technologies like WiFi, Bluetooth, and LTE are unable to meet these requirements, forcing the use of wired communication in industrial control systems. This paper introduces a wireless communication protocol that capitalizes on multiuser diversity and cooperative communication to achieve the ultra-reliability with a low-latency constraint. Our protocol is analyzed using the communication-theoretic delay-limited-capacity framework and compared to baseline schemes that primarily exploit frequency diversity. For a scenario inspired by an industrial printing application with thirty nodes in the control loop, 20B messages transmitted between pairs of nodes and a cycle time of $2$ ms, an idealized protocol can achieve a cycle failure probability (probability that any packet in a cycle is not successfully delivered) lower than $10^{-9}$ with nominal SNR below 5 dB in a 20MHz wide channel.

Submitted 23 January, 2017; v1 submitted 9 September, 2016; originally announced September 2016.

Comments: A preliminary version of this work appeared at IEEE International Conference on Communications 2015

arXiv:1406.3726 [pdf, ps, other]

Evaluation of Machine Learning Techniques for Green Energy Prediction

Authors: Ankur Sahai

Abstract: We evaluate the following Machine Learning techniques for Green Energy (Wind, Solar) Prediction: Bayesian Inference, Neural Networks, Support Vector Machines, Clustering techniques (PCA). Our objective is to predict green energy using weather forecasts, predict deviations from forecast green energy, find correlation amongst different weather parameters and green energy availability, recover lost or missing energy (/ weather) data. We use historical weather data and weather forecasts for the same.

Submitted 14 June, 2014; originally announced June 2014.

arXiv:1402.6552 [pdf, other]

Renewable Energy Prediction using Weather Forecasts for Optimal Scheduling in HPC Systems

Authors: Ankur Sahai

Abstract: The objective of the GreenPAD project is to use green energy (wind, solar and biomass) for powering data-centers that are used to run HPC jobs. As a part of this it is important to predict the Renewable (Wind) energy for efficient scheduling (executing jobs that require higher energy when there is more green energy available and vice-versa). For predicting the wind energy we first analyze the historical data to find a statistical model that gives relation between wind energy and weather attributes. Then we use this model based on the weather forecast data to predict the green energy availability in the future. Using the green energy prediction obtained from the statistical model we are able to precompute job schedules for maximizing the green energy utilization in the future. We propose a model which uses live weather data in addition to machine learning techniques (which can predict future deviations in weather conditions based on current deviations from the forecast) to make on-the-fly changes to the precomputed schedule (based on green energy prediction). For this we first analyze the data using histograms and simple statistical tools such as correlation. In addition we build (correlation) regression model for finding the relation between wind energy availability and weather attributes (temperature, cloud cover, air pressure, wind speed / direction, precipitation and sunshine). We also analyze different algorithms and machine learning techniques for optimizing the job schedules for maximizing the green energy utilization.

Submitted 26 February, 2014; originally announced February 2014.

arXiv:1402.5642 [pdf, other]

VM Power Prediction in Distributed Systems for Maximizing Renewable Energy Usage

Authors: Ankur Sahai

Abstract: In the context of GreenPAD project it is important to predict the energy consumption of individual (and mixture of) VMs / workload for optimal scheduling (running those VMs which require higher energy when there is more green energy available and vice-versa) in order to maximize green energy utilization. For this we execute the following experiments on an Openstack cloud testbed consisting of Fujitsu servers: VM energy measurement for different configurations (flavor + workload) and VM energy prediction for a new configuration. The automation framework for running these experiments uses bash scripts which call tools like 'stress' (simulating workloads), 'collected' (resource usage) and 'IPMI' (power measurement). We propose a linear model for predicting the power usage of the VMs based on regression. We first collect the resource usage (using collected) and the associated power usage (using IPMI) for different VM configurations and use this to build a (multi-) regression model (between resource usage and VM energy consumption). Then we use the information about the resource usage patterns of the new workload to predict the power usage. For predicting power for mix of workloads we execute (build a regression model based on) experiments with random workloads. We observe the highest energy usage for CPU-intensive workloads followed by memory-intensive workloads.

Submitted 23 February, 2014; originally announced February 2014.

arXiv:1312.4182 [pdf, ps, other]

Adaptive Protocols for Interactive Communication

Authors: Shweta Agrawal, Ran Gelles, Amit Sahai

Abstract: How much adversarial noise can protocols for interactive communication tolerate? This question was examined by Braverman and Rao (IEEE Trans. Inf. Theory, 2014) for the case of "robust" protocols, where each party sends messages only in fixed and predetermined rounds. We consider a new class of non-robust protocols for Interactive Communication, which we call adaptive protocols. Such protocols adapt structurally to the noise induced by the channel in the sense that both the order of speaking, and the length of the protocol may vary depending on observed noise. We define models that capture adaptive protocols and study upper and lower bounds on the permissible noise rate in these models. When the length of the protocol may adaptively change according to the noise, we demonstrate a protocol that tolerates noise rates up to $1/3$. When the order of speaking may adaptively change as well, we demonstrate a protocol that tolerates noise rates up to $2/3$. Hence, adaptivity circumvents an impossibility result of $1/4$ on the fraction of tolerable noise (Braverman and Rao, 2014).

Submitted 7 August, 2015; v1 submitted 15 December, 2013; originally announced December 2013.

Comments: Content is similar to previous version yet with an improved presentation

arXiv:1308.5045 [pdf, other]

Network Coding meets Decentralized Control: Network Linearization and Capacity-Stabilizablilty Equivalence

Authors: Se Yong Park, Anant Sahai

Abstract: We take a unified view of network coding and decentralized control. Precisely speaking, we consider both as linear time-invariant systems by appropriately restricting channels and coding schemes of network coding to be linear time-invariant, and the plant and controllers of decentralized control to be linear time-invariant as well. First, we apply linear system theory to network coding. This gives a novel way of converting an arbitrary relay network to an equivalent acyclic single-hop relay network, which we call Network Linearization. Based on network linearization, we prove that the fundamental design limit, mincut, is achievable by a linear time-invariant network-coding scheme regardless of the network topology. Then, we use the network-coding to view decentralized linear systems. We argue that linear time-invariant controllers in a decentralized linear system "communicate" via linear network coding to stabilize the plant. To justify this argument, we give an algorithm to "externalize" the implicit communication between the controllers that we believe must be occurring to stabilize the plant. Based on this, we show that the stabilizability condition for decentralized linear systems comes from an underlying communication limit, which can be described by the algebraic mincut-maxflow theorem. With this re-interpretation in hand, we also consider stabilizability over LTI networks to emphasize the connection with network coding. In particular, in broadcast and unicast problems, unintended messages at the receivers will be modeled as secrecy constraints.

Submitted 23 August, 2013; originally announced August 2013.

arXiv:1306.5018 [pdf, other]

doi 10.1109/TIT.2015.2402279

Information embedding and the triple role of control

Authors: Pulkit Grover, Aaron B. Wagner, Anant Sahai

Abstract: We consider the problem of information embedding where the encoder modifies a white Gaussian host signal in a power-constrained manner to encode a message, and the decoder recovers both the embedded message and the modified host signal. This partially extends the recent work of Sumszyk and Steinberg to the continuous-alphabet Gaussian setting. Through a control-theoretic lens, we observe that the problem is a minimalist example of what is called the "triple role" of control actions. We show that a dirty-paper-coding strategy achieves the optimal rate for perfect recovery of the modified host and the message for any message rate. For imperfect recovery of the modified host, by deriving bounds on the minimum mean-square error (MMSE) in recovering the modified host signal, we show that DPC-based strategies are guaranteed to attain within a uniform constant factor of 16 of the optimal weighted sum of power required in host signal modification and the MMSE in the modified host signal reconstruction for all weights and all message rates. When specialized to the zero-rate case, our results provide the tightest known lower bounds on the asymptotic costs for the vector version of a famous open problem in decentralized control: the Witsenhausen counterexample. Numerically, this tighter bound helps us characterize the asymptotically optimal costs for the vector Witsenhausen problem to within a factor of 1.3 for all problem parameters, improving on the earlier best known bound of 2.

Submitted 20 June, 2013; originally announced June 2013.

Comments: Revised version submitted to IEEE Trans. Info Theory

arXiv:1306.0369 [pdf, other]

DNA Pen: A Tool for Drawing on a Molecular Canvas

Authors: Arnav Goyal, Dixita Limbachiya, Shikhar Kumar Gupta, Foram Joshi, Sushant Pritmani, Akshita Sahai, Manish K Gupta

Abstract: DNA origami is an interdisciplinary area where DNA can be used as a building block for making useful stuff at nanoscale. This work presents an open source software DNA pen (based on the recent work of Peng Yin and his group) which can be used (using free hand and digital molecular canvas) to draw an object at nanoscale. Software generates error free DNA sequences which can be used in the wet lab to create the object at the nanoscale. Using DNA pen we have drawn several objects including the map of India and sanskrit letter "Om" from free hand molecular canvas and digital letter DNA using digitized molecular canvas.

Submitted 3 June, 2013; originally announced June 2013.

Comments: Submitted to DNA19. The software is available at http://www.guptalab.org/dnapen

arXiv:1302.5860 [pdf, other]

A universal, operational theory of unicast multi-user communication with fidelity criteria

Authors: Mukul Agarwal, Sanjoy Mitter, Anant Sahai

Abstract: This is a three part paper. Optimality of source-channel separation for communication with a fidelity criterion when the channel is compound as defined by Csiszar and Korner in their book and general as defined by Verdu and Han, is proved in Part I. It is assumed that random codes are permitted. The word "universal" in the title of this paper refers to the fact that the channel model is compound. The proof uses a layered black-box or a layered input-output view-point. In particular, only the end-to-end description of the channel as being capable of communicating a source to within a certain distortion level is used when proving separation. This implies that the channel model does not play any role for separation to hold as long as there is a source model. Further implications of the layered black-box view-point are discussed. Optimality of source-medium separation for multi-user communication with fidelity criteria over a general, compound medium in the unicast setting is proved in Part II, thus generalizing Part I to the unicast, multi-user setting. Part III gets to an understanding of the question, "Why is a channel which is capable of communicating a source to within a certain distortion level, also capable of communicating bits at any rate less than the infimum of the rates needed to code the source to within the distortion level": this lies at the heart of why optimality of separation for communication with a fidelity criterion holds. The perspective taken to get to this understanding is a randomized covering-packing perspective, and the proof is operational.

Submitted 23 February, 2013; originally announced February 2013.

arXiv:1302.2185 [pdf, other]

doi 10.1109/TWC.2013.010214.130226

Passive Self-Interference Suppression for Full-Duplex Infrastructure Nodes

Authors: Evan Everett, Achaleshwar Sahai, Ashutosh Sabharwal

Abstract: Recent research results have demonstrated the feasibility of full-duplex wireless communication for short-range links. Although the focus of the previous works has been active cancellation of the self-interference signal, a majority of the overall self-interference suppression is often due to passive suppression, i.e., isolation of the transmit and receive antennas. We present a measurement-based study of the capabilities and limitations of three key mechanisms for passive self-interference suppression: directional isolation, absorptive shielding, and cross-polarization. The study demonstrates that more than 70 dB of passive suppression can be achieved in certain environments, but also establishes two results on the limitations of passive suppression: (1) environmental reflections limit the amount of passive suppression that can be achieved, and (2) passive suppression, in general, increases the frequency selectivity of the residual self-interference signal. These results suggest two design implications: (1) deployments of full-duplex infrastructure nodes should minimize near-antenna reflectors, and (2) active cancellation in concatenation with passive suppression should employ higher-order filters or per-subcarrier cancellation.

Submitted 16 December, 2013; v1 submitted 8 February, 2013; originally announced February 2013.

Comments: 16 pages, Accepted by IEEE Transactions on Wireless Communication, October 2013

arXiv:1212.5462 [pdf, other]

doi 10.1109/ACSSC.2012.6488952

On the Impact of Phase Noise on Active Cancellation in Wireless Full-Duplex

Authors: Achaleshwar Sahai, Gaurav Patel, Chris Dick, Ashutosh Sabharwal

Abstract: Recent experimental results have shown that full-duplex communication is possible for short-range communications. However, extending full-duplex to long-range communication remains a challenge, primarily due to residual self-interference even with a combination of passive suppression and active cancellation methods. In this paper, we investigate the root cause of performance bottlenecks in current full-duplex systems. We first classify all known full-duplex architectures based on how they compute their cancelling signal and where the cancelling signal is injected to cancel self-interference. Based on the classification, we analytically explain several published experimental results. The key bottleneck in current systems turns out to be the phase noise in the local oscillators in the transmit and receive chain of the full-duplex node. As a key by-product of our analysis, we propose signal models for wideband and MIMO full-duplex systems, capturing all the salient design parameters, and thus allowing future analytical development of advanced coding and signal design for full-duplex systems.

Submitted 21 December, 2012; originally announced December 2012.

Comments: 35 pages, Submitted to IEEE Transactions on Vehicular Technology, Dec 2012

arXiv:1210.5287 [pdf, ps, other]

Attribute-Based Encryption for Circuits from Multilinear Maps

Authors: Amit Sahai, Brent Waters

Abstract: In this work, we provide the first construction of Attribute-Based Encryption (ABE) for general circuits. Our construction is based on the existence of multilinear maps. We prove selective security of our scheme in the standard model under the natural multilinear generalization of the BDDH assumption. Our scheme achieves both Key-Policy and Ciphertext-Policy variants of ABE.

Submitted 18 October, 2012; originally announced October 2012.

arXiv:1210.3719 [pdf, ps, other]

On Constant-Round Concurrent Zero-Knowledge from a Knowledge Assumption

Authors: Divya Gupta, Amit Sahai

Abstract: In this work, we consider the long-standing open question of constructing constant-round concurrent zero-knowledge protocols in the plain model. Resolving this question is known to require non-black-box techniques. We consider non-black-box techniques for zero-knowledge based on knowledge assumptions, a line of thinking initiated by the work of Hada and Tanaka (CRYPTO 1998). Prior to our work, it was not known whether knowledge assumptions could be used for achieving security in the concurrent setting, due to a number of significant limitations that we discuss here. Nevertheless, we obtain the following results: 1. We obtain the first constant round concurrent zero-knowledge argument for \textbf{NP} in the plain model based on a new variant of knowledge of exponent assumption. Furthermore, our construction avoids the inefficiency inherent in previous non-black-box techniques such as those of Barak (FOCS 2001); we obtain our result through an efficient protocol compiler. 2. Unlike Hada and Tanaka, we do not require a knowledge assumption to argue the soundness of our protocol. Instead, we use a discrete log like assumption, which we call Diffie-Hellman Logarithm Assumption, to prove the soundness of our protocol. 3. We give evidence that our new variant of knowledge of exponent assumption is in fact plausible. In particular, we show that our assumption holds in the generic group model. 4. Knowledge assumptions are especially delicate assumptions whose plausibility may be hard to gauge. We give a novel framework to express knowledge assumptions in a more flexible way, which may allow for formulation of plausible assumptions and exploration of their impact and application in cryptography.

Submitted 13 October, 2012; originally announced October 2012.

Comments: 30 pages, 3 figures

arXiv:1206.3634 [pdf, other]

Balls into Bins: strict Capacities and Edge Weights

Authors: Ankur Sahai

Abstract: We explore a novel theoretical model for studying the performance of distributed storage management systems where the data-centers have limited capacities (as compared to storage space requested by the users). Prior schemes such as Balls-into-bins (used for load balancing) neither consider bin (consumer) capacities (multiple balls into a bin) nor the future performance of the system after balls (producer requests) are allocated to bins, and restrict the number of balls as a function of the number of bins. Our problem consists of finding an optimal assignment of the online producer requests to consumers (via weighted edges) in a complete bipartite graph while ensuring that the total size of request assigned on a consumer is limited by its capacity. The metric used to measure the performance in this model is the (minimization of) weighted sum of the requests assigned on the edges (loads) and their corresponding weights. We first explore the optimal offline algorithms followed by competitive analysis of different online techniques using an oblivious adversary. LP and Primal-Dual algorithms are used for calculating the optimal offline solution in O(r*n) time (where r and n are the number of requests and consumers respectively) while randomized algorithms are used for the online case. For the simplified model with equal consumer capacities an average-case competitive ratio of AVG(d) / MIN(d) (where d is the edge weight / distance) is achieved using an algorithm that has equal probability for selecting any of the available edges with a running time of $O(r)$. In extending the model to arbitrary consumer capacities we show an average case competitive ratio of AVG(d*c) / (AVG(c) *MIN(d)).

Submitted 16 June, 2012; originally announced June 2012.

arXiv:1107.0607 [pdf, other]

Pushing the limits of Full-duplex: Design and Real-time Implementation

Authors: Achaleshwar Sahai, Gaurav Patel, Ashutosh Sabharwal

Abstract: Recent work has shown the feasibility of single-channel full-duplex wireless physical layer, allowing nodes to send and receive in the same frequency band at the same time. In this report, we first design and implement a real-time 64-subcarrier 10 MHz full-duplex OFDM physical layer, FD-PHY. The proposed FD-PHY not only allows synchronous full-duplex transmissions but also selective asynchronous full-duplex modes. Further, we show that in over-the-air experiments using optimal antenna placement on actual devices, the self-interference can be suppressed up to 80 dB, which is 10 dB more than prior reported results. Then we propose a full-duplex MAC protocol, FD-MAC, which builds on IEEE 802.11 with three new mechanisms -- shared random backoff, header snooping and virtual backoffs. The new mechanisms allow FD-MAC to discover and exploit full-duplex opportunities in a distributed manner. Our over-the-air tests show over 70% throughput gains from using full-duplex over half-duplex in realistically used cases.

Submitted 4 July, 2011; originally announced July 2011.

Comments: 12 page Rice University technical report

Report number: TREE1104

arXiv:1105.0233 [pdf, other]

Derandomization of Online Assignment Algorithms for Dynamic Graphs

Authors: Ankur Sahai

Abstract: This paper analyzes different online algorithms for the problem of assigning weights to edges in a fully-connected bipartite graph that minimizes the overall cost while satisfying constraints. Edges in this graph may disappear and reappear over time. Performance of these algorithms is measured using simulations. This paper also attempts to derandomize the randomized online algorithm for this problem.

Submitted 1 May, 2011; originally announced May 2011.

arXiv:1105.0232 [pdf, ps, other]

Online Assignment Algorithms for Dynamic Bipartite Graphs

Authors: Ankur Sahai

Abstract: This paper analyzes the problem of assigning weights to edges incrementally in a dynamic complete bipartite graph consisting of producer and consumer nodes. The objective is to minimize the overall cost while satisfying certain constraints. The cost and constraints are functions of attributes of the edges, nodes and online service requests. Novelty of this work is that it models real-time distributed resource allocation using an approach to solve this theoretical problem. This paper studies variants of this assignment problem where the edges, producers and consumers can disappear and reappear or their attributes can change over time. Primal-Dual algorithms are used for solving these problems and their competitive ratios are evaluated.

Submitted 1 May, 2011; originally announced May 2011.

arXiv:1104.4805 [pdf, other]

doi 10.1109/TIT.2013.2278691

Capacity of All Nine Models of Channel Output Feedback for the Two-user Interference Channel

Authors: Achaleshwar Sahai, Vaneet Aggarwal, Melda Yuksel, Ashutosh Sabharwal

Abstract: In this paper, we study the impact of different channel output feedback architectures on the capacity of the two-user interference channel. For a two-user interference channel, a feedback link can exist between receivers and transmitters in 9 canonical architectures (see Fig. 2), ranging from only one feedback link to four feedback links. We derive the exact capacity region for the symmetric deterministic interference channel and the constant-gap capacity region for the symmetric Gaussian interference channel for all of the 9 architectures. We show that for a linear deterministic symmetric interference channel, in the weak interference regime, all models of feedback, except the one, which has only one of the receivers feeding back to its own transmitter, have the identical capacity region. When only one of the receivers feeds back to its own transmitter, the capacity region is a strict subset of the capacity region of the rest of the feedback models in the weak interference regime. However, the sum-capacity of all feedback models is identical in the weak interference regime. Moreover, in the strong interference regime all models of feedback with at least one of the receivers feeding back to its own transmitter have the identical sum-capacity. For the Gaussian interference channel, the results of the linear deterministic model follow, where capacity is replaced with approximate capacity.

Submitted 25 January, 2013; v1 submitted 25 April, 2011; originally announced April 2011.

Comments: submitted to IEEE Transactions on Information Theory, results improved by deriving capacity region of all 9 canonical feedback models in two-user interference channel

Journal ref: IEEE Transactions on Information Theory, vol. 59, no. 11, pp. 6957-6979, Nov. 2013

arXiv:1104.0739 [pdf, ps, other]

Potent Tree Codes and their applications: Coding for Interactive Communication, revisited

Authors: Ran Gelles, Amit Sahai

Abstract: We study the fundamental problem of reliable interactive communication over a noisy channel. In a breakthrough sequence of papers published in 1992 and 1993, Schulman gave non-constructive proofs of the existence of general methods to emulate any two-party interactive protocol such that: (1) the emulation protocol takes a constant-factor longer than the original protocol, and (2) if the emulation protocol is executed over a noisy channel, then the probability that the emulation protocol fails is exponentially small in the total length of the protocol. Unfortunately, Schulman's emulation procedures either only work in a model with a large amount of shared randomness, or are non-constructive in that they rely on the existence of good tree codes. The only known proofs of the existence of good tree codes are non-constructive, and finding an explicit construction remains an important open problem. Indeed, randomly generated tree codes are not good tree codes with overwhelming probability. In this work, we revisit the problem of reliable interactive communication, and obtain the following results: We introduce a new notion of goodness for a tree code, and define the notion of a potent tree code. We believe that this notion is of independent interest. We prove the correctness of an explicit emulation procedure based on any potent tree code. We show that a randomly generated tree code (with suitable constant alphabet size) is a potent tree code with overwhelming probability. Furthermore, we are able to partially derandomize this result using only $O(n)$ random bits, where $n$ is the depth of the tree. These results allow us to obtain the first fully explicit emulation procedure for reliable interactive communication over noisy channels with a constant communication overhead, and exponentially small failure probability.

Submitted 5 April, 2011; originally announced April 2011.

Comments: 26 pages, 6 figures

arXiv:1010.4855 [pdf, other]

Towards a communication-theoretic understanding of system-level power consumption

Authors: Pulkit Grover, Kristen Ann Woyach, Anant Sahai

Abstract: Traditional communication theory focuses on minimizing transmit power. However, communication links are increasingly operating at shorter ranges where transmit power can be significantly smaller than the power consumed in decoding. This paper models the required decoding power and investigates the minimization of total system power from two complementary perspectives. First, an isolated point-to-point link is considered. Using new lower bounds on the complexity of message-passing decoding, lower bounds are derived on decoding power. These bounds show that 1) there is a fundamental tradeoff between transmit and decoding power; 2) unlike the implications of the traditional "waterfall" curve which focuses on transmit power, the total power must diverge to infinity as error probability goes to zero; 3) Regular LDPCs, and not their known capacity-achieving irregular counterparts, can be shown to be power order optimal in some cases; and 4) the optimizing transmit power is bounded away from the Shannon limit. Second, we consider a collection of links. When systems both generate and face interference, coding allows a system to support a higher density of transmitter-receiver pairs (assuming interference is treated as noise). However, at low densities, uncoded transmission may be more power-efficient in some cases.

Submitted 16 February, 2011; v1 submitted 23 October, 2010; originally announced October 2010.

Comments: 24 pages, 13 figures, revision of our submission to JSAC Special issue on energy-efficient wireless communications

arXiv:1010.4854 [pdf, other]

Implicit and explicit communication in decentralized control

Authors: Pulkit Grover, Anant Sahai

Abstract: There has been substantial progress recently in understanding toy problems of purely implicit signaling. These are problems where the source and the channel are implicit -- the message is generated endogenously by the system, and the plant itself is used as a channel. In this paper, we explore how implicit and explicit communication can be used synergistically to reduce control costs. The setting is an extension of Witsenhausen's counterexample where a rate-limited external channel connects the two controllers. Using a semi-deterministic version of the problem, we arrive at a binning-based strategy that can outperform the best known strategies by an arbitrarily large factor. We also show that our binning-based strategy attains within a constant factor of the optimal cost for an asymptotically infinite-length version of the problem uniformly over all problem parameters and all rates on the external channel. For the scalar case, although our results yield approximate optimality for each fixed rate, we are unable to prove approximate optimality uniformly over all rates.

Submitted 23 October, 2010; originally announced October 2010.

Comments: Presented at Allerton'10

arXiv:1009.2528 [pdf, other]

Is Witsenhausen's counterexample a relevant toy?

Authors: Pulkit Grover, Anant Sahai

Abstract: This paper answers a question raised by Doyle on the relevance of the Witsenhausen counterexample as a toy decentralized control problem. The question has two sides, the first of which focuses on the lack of an external channel in the counterexample. Using existing results, we argue that the core difficulty in the counterexample is retained even in the presence of such a channel. The second side questions the LQG formulation of the counterexample. We consider alternative formulations and show that the understanding developed for the LQG case guides the investigation for these other cases as well. Specifically, we consider 1) a variation on the original counterexample with general, but bounded, noise distributions, and 2) an adversarial extension with bounded disturbance and quadratic costs. For each of these formulations, we show that quantization-based nonlinear strategies outperform linear strategies by an arbitrarily large factor. Further, these nonlinear strategies also perform within a constant factor of the optimal, uniformly over all possible parameter choices (for fixed noise distributions in the Bayesian case). Fortuitously, the assumption of bounded noise results in a significant simplification of proofs as compared to those for the LQG formulation. Therefore, the results in this paper are also of pedagogical interest.

Submitted 13 September, 2010; originally announced September 2010.

Comments: preprint for paper that will appear in proceedings of 49th IEEE Conference on Decision and Control (CDC) 2010, Atlanta, Georgia

arXiv:1003.0520 [pdf, other]

doi 10.1109/ITWKSPS.2010.5503206

Information embedding meets distributed control

Authors: Pulkit Grover, Aaron B. Wagner, Anant Sahai

Abstract: We consider the problem of information embedding where the encoder modifies a white Gaussian host signal in a power-constrained manner to encode the message, and the decoder recovers both the embedded message and the modified host signal. This extends the recent work of Sumszyk and Steinberg to the continuous-alphabet Gaussian setting. We show that a dirty-paper-coding based strategy achieves the optimal rate for perfect recovery of the modified host and the message. We also provide bounds for the extension wherein the modified host signal is recovered only to within a specified distortion. When specialized to the zero-rate case, our results provide the tightest known lower bounds on the asymptotic costs for the vector version of a famous open problem in distributed control -- the Witsenhausen counterexample. Using this bound, we characterize the asymptotically optimal costs for the vector Witsenhausen problem numerically to within a factor of 1.3 for all problem parameters, improving on the earlier best known bound of 2.

Submitted 2 March, 2010; originally announced March 2010.

Comments: 19 pages, 7 figures. Presented at ITW'10. Submitted to IEEE Transactions on Information Theory

arXiv:1003.0514 [pdf, other]

doi 10.1109/WIOPT.2009.5291559

The finite-dimensional Witsenhausen counterexample

Authors: Pulkit Grover, Se Yong Park, Anant Sahai

Abstract: Recently, a vector version of Witsenhausen's counterexample was considered and it was shown that in the limit of infinite vector length, certain quantization-based control strategies are provably within a constant factor of the optimal cost for all possible problem parameters. In this paper, finite vector lengths are considered with the dimension being viewed as an additional problem parameter. By applying a large-deviation "sphere-packing" philosophy, a lower bound to the optimal cost for the finite dimensional case is derived that uses appropriate shadows of the infinite-length bound. Using the new lower bound, we show that good lattice-based control strategies achieve within a constant factor of the optimal cost uniformly over all possible problem parameters, including the vector length. For Witsenhausen's original problem -- the scalar case -- the gap between regular lattice-based strategies and the lower bound is numerically never more than a factor of 8.

Submitted 2 March, 2010; originally announced March 2010.

Comments: 32 pages, 7 figures, 1 table. Presented at ConCom 2009, Seoul, Korea. Submitted to IEEE Transactions on Automatic Control

arXiv:0911.3872 [pdf, ps, other]

Equivalence perspectives in communication, source-channel connections and universal source-channel separation

Authors: Mukul Agarwal, Anant Sahai, Sanjoy Mitter

Abstract: An operational perspective is used to understand the relationship between source and channel coding. This is based on a direct reduction of one problem to another that uses random coding (and hence common randomness) but unlike all prior work, does not involve any functional computations, in particular, no mutual-information computations. This result is then used to prove a universal source-channel separation theorem in the rate-distortion context where universality is in the sense of a compound "general channel."

Submitted 19 November, 2009; originally announced November 2009.

arXiv:0905.2392 [pdf, other]

On Channel Output Feedback in Deterministic Interference Channels

Authors: Achaleshwar Sahai, Vaneet Aggarwal, Melda Yuksel, Ashutosh Sabharwal

Abstract: In this paper, we study the effect of channel output feedback on the sum capacity in a two-user symmetric deterministic interference channel. We find that having a single feedback link from one of the receivers to its own transmitter results in the same sum capacity as having a total of 4 feedback links from both the receivers to both the transmitters. Hence, from the sum capacity point of view, the three additional feedback links are not helpful. We also consider a half-duplex feedback model where the forward and the feedback resources are symmetric and timeshared. Surprisingly, we find that there is no gain in sum-capacity with feedback in a half-duplex feedback model when interference links have more capacity than direct links.

Submitted 31 August, 2009; v1 submitted 14 May, 2009; originally announced May 2009.

Comments: To appear in IEEE ITW, Oct. 2009

arXiv:0811.0475 [pdf, ps, other]

Secure Arithmetic Computation with No Honest Majority

Authors: Yuval Ishai, Manoj Prabhakaran, Amit Sahai

Abstract: We study the complexity of securely evaluating arithmetic circuits over finite rings. This question is motivated by natural secure computation tasks. Focusing mainly on the case of two-party protocols with security against malicious parties, our main goals are to: (1) only make black-box calls to the ring operations and standard cryptographic primitives, and (2) minimize the number of such black-box calls as well as the communication overhead. We present several solutions which differ in their efficiency, generality, and underlying intractability assumptions. These include: 1. An unconditionally secure protocol in the OT-hybrid model which makes a black-box use of an arbitrary ring $R$, but where the number of ring operations grows linearly with (an upper bound on) $\log|R|$. 2. Computationally secure protocols in the OT-hybrid model which make a black-box use of an underlying ring, and in which the number of ring operations does not grow with the ring size. These results extend a previous approach of Naor and Pinkas for secure polynomial evaluation (SIAM J. Comput., 35(5), 2006). 3. A protocol for the rings $\mathbb{Z}_m=\mathbb{Z}/m\mathbb{Z}$ which only makes a black-box use of a homomorphic encryption scheme. When $m$ is prime, the (amortized) number of calls to the encryption scheme for each gate of the circuit is constant. All of our protocols are in fact UC-secure in the OT-hybrid model and can be generalized to multiparty computation with an arbitrary number of malicious parties.

Submitted 8 November, 2008; v1 submitted 4 November, 2008; originally announced November 2008.

Comments: minor editorial changes

arXiv:0805.2423 [pdf, ps, other]

Green Codes: Energy-Efficient Short-Range Communication

Authors: Pulkit Grover, Anant Sahai

Abstract: A green code attempts to minimize the total energy per-bit required to communicate across a noisy channel. The classical information-theoretic approach neglects the energy expended in processing the data at the encoder and the decoder and only minimizes the energy required for transmissions. Since there is no cost associated with using more degrees of freedom, the traditionally optimal strategy is to communicate at rate zero. In this work, we use our recently proposed model for the power consumed by iterative message passing. Using generalized sphere-packing bounds on the decoding power, we find lower bounds on the total energy consumed in the transmissions and the decoding, allowing for freedom in the choice of the rate. We show that contrary to the classical intuition, the rate for green codes is bounded away from zero for any given error probability. In fact, as the desired bit-error probability goes to zero, the optimizing rate for our bounds converges to 1.

Submitted 15 May, 2008; originally announced May 2008.

Comments: Accepted at International Symposium on Information Theory (ISIT) 2008

arXiv:0801.0352 [pdf, ps, other]

The price of certainty: "waterslide curves" and the gap to capacity

Authors: Anant Sahai, Pulkit Grover

Abstract: The classical problem of reliable point-to-point digital communication is to achieve a low probability of error while keeping the rate high and the total power consumption small. Traditional information-theoretic analysis uses "waterfall" curves to convey the revolutionary idea that unboundedly low probabilities of bit-error are attainable using only finite transmit power. However, practitioners have long observed that the decoder complexity, and hence the total power consumption, goes up when attempting to use sophisticated codes that operate close to the waterfall curve. This paper gives an explicit model for power consumption at an idealized decoder that allows for extreme parallelism in implementation. The decoder architecture is in the spirit of message passing and iterative decoding for sparse-graph codes. Generalized sphere-packing arguments are used to derive lower bounds on the decoding power needed for any possible code given only the gap from the Shannon limit and the desired probability of error. As the gap goes to zero, the energy per bit spent in decoding is shown to go to infinity. This suggests that to optimize total power, the transmitter should operate at a power that is strictly above the minimum demanded by the Shannon capacity. The lower bound is plotted to show an unavoidable tradeoff between the average bit-error probability and the total power used in transmission and decoding. In the spirit of conventional waterfall curves, we call these "waterslide" curves.

Submitted 2 January, 2008; originally announced January 2008.

Comments: 37 pages, 13 figures. Submitted to IEEE Transactions on Information Theory. This version corrects a subtle bug in the proofs of the original submission and improves the bounds significantly

Report number: UCB/EECS-2008-1

Showing 1–50 of 66 results for author: Sahai, A