Search | arXiv e-print repository

Asymptotics of the Sketched Pseudoinverse

Authors: Daniel LeJeune, Pratik Patil, Hamid Javadi, Richard G. Baraniuk, Ryan J. Tibshirani

Abstract: We take a random matrix theory approach to random sketching and show an asymptotic first-order equivalence of the regularized sketched pseudoinverse of a positive semidefinite matrix to a certain evaluation of the resolvent of the same matrix. We focus on real-valued regularization and extend previous results on an asymptotic equivalence of random matrices to the real setting, providing a precise characterization of the equivalence even under negative regularization, including a precise characterization of the smallest nonzero eigenvalue of the sketched matrix, which may be of independent interest. We then further characterize the second-order equivalence of the sketched pseudoinverse. We also apply our results to the analysis of the sketch-and-project method and to sketched ridge regression. Lastly, we prove that these results generalize to asymptotically free sketching matrices, obtaining the resulting equivalence for orthogonal sketching matrices and comparing our results to several common sketches used in practice.

Submitted 6 October, 2023; v1 submitted 7 November, 2022; originally announced November 2022.

Comments: 45 pages, 9 figures

MSC Class: 15B52; 46L54; 62J07

arXiv:2205.14055 [pdf, other]

A Blessing of Dimensionality in Membership Inference through Regularization

Authors: Jasper Tan, Daniel LeJeune, Blake Mason, Hamid Javadi, Richard G. Baraniuk

Abstract: Is overparameterization a privacy liability? In this work, we study the effect that the number of parameters has on a classifier's vulnerability to membership inference attacks. We first demonstrate how the number of parameters of a model can induce a privacy--utility trade-off: increasing the number of parameters generally improves generalization performance at the expense of lower privacy. However, remarkably, we then show that if coupled with proper regularization, increasing the number of parameters of a model can actually simultaneously increase both its privacy and performance, thereby eliminating the privacy--utility trade-off. Theoretically, we demonstrate this curious phenomenon for logistic regression with ridge regularization in a bi-level feature ensemble setting. Pursuant to our theoretical exploration, we develop a novel leave-one-out analysis tool to precisely characterize the vulnerability of a linear classifier to the optimal membership inference attack. We empirically exhibit this "blessing of dimensionality" for neural networks on a variety of tasks using early stopping as the regularizer.

Submitted 13 April, 2023; v1 submitted 27 May, 2022; originally announced May 2022.

Comments: 26 pages, 14 figures

arXiv:2202.01243 [pdf, other]

Parameters or Privacy: A Provable Tradeoff Between Overparameterization and Membership Inference

Authors: Jasper Tan, Blake Mason, Hamid Javadi, Richard G. Baraniuk

Abstract: A surprising phenomenon in modern machine learning is the ability of a highly overparameterized model to generalize well (small error on the test data) even when it is trained to memorize the training data (zero error on the training data). This has led to an arms race towards increasingly overparameterized models (cf. deep learning). In this paper, we study an underexplored hidden cost of overparameterization: the fact that overparameterized models may be more vulnerable to privacy attacks, in particular the membership inference attack that predicts the (potentially sensitive) examples used to train a model. We significantly extend the relatively few empirical results on this problem by theoretically proving for an overparameterized linear regression model in the Gaussian data setting that membership inference vulnerability increases with the number of parameters. Moreover, a range of empirical studies indicates that more complex, nonlinear models exhibit the same behavior. Finally, we extend our analysis towards ridge-regularized linear regression and show in the Gaussian data setting that increased regularization also increases membership inference vulnerability in the overparameterized regime.

Submitted 30 November, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

Comments: 25 pages, 8 figures

arXiv:2111.04653

Adaptive Steganography Based on bargain Game

Authors: Behbod Keshavarzi, Hamidreza Navidi, Parvaneh Asghari, Seyyed Hamid Haji Seyyed Javadi

Abstract: The capacity and security of the confidential message on the channel are two important challenges in steganography. In this paper, a new block steganography model is presented using the bargaining method, so that a competitive model is introduced. In this game, the blocks are the players. The bargain aims to embed information without reducing capacity while increasing security. The proposed model can be used in both the spatial domain and the transform domain, which are two important methods of steganography. For this purpose, an example of a spatial domain model is introduced. In the first step, the image is divided into $n \times n$ blocks. In the second step, a graph coloring algorithm selects the pixels in each block in which to embed confidential information. In the third step, following the bargaining method in game theory, each block plays the role of a player, and the competition between players is based on a defined goal function; blocks with a higher security-to-capacity ratio receive higher priority, as determined by the bargaining model. Information is embedded in the two least significant bits (LSBs). An example of a transform domain method also shows that security increases without a decrease in capacity. The results are evaluated by three criteria: PSNR, histogram, and $ε$-security. In addition, 2000 standard images were evaluated, and it was observed that the proposed method improves on block methods of embedding information.

Submitted 10 May, 2022; v1 submitted 8 November, 2021; originally announced November 2021.

Comments: This paper requires major changes and is being revised

arXiv:2110.00917 [pdf]

Binary code optimization

Authors: Parviz Gharehbagheri, Sayeed Hamid Haji Sayeed Javadi, Parvaneh Asghari, Naser Gharehbagheri

Abstract: This article shows that any type of binary data can be defined as a collection of codewords of variable length. This feature helps us to define an injective and surjective function from the suggested codewords to the required codewords. Therefore, by substituting the new codewords, the binary data is transformed into other binary data that serves the intended goals. One of these goals is to reduce data size: instead of the original codewords of each binary data, Huffman codewords are substituted to reduce the data size. One of the features of this method is non-negative compression for any type of binary data; that is, regardless of the size of the code table, the difference between the original data size and the data size after compression will be greater than or equal to zero. Another important and practical feature of this method is the use of symmetric codewords instead of the suggested codewords in order to create symmetry, reversibility, and error resistance properties with two-way decoding.

Submitted 2 October, 2021; originally announced October 2021.

Comments: in Persian language

arXiv:2106.07769 [pdf, other]

The Flip Side of the Reweighted Coin: Duality of Adaptive Dropout and Regularization

Authors: Daniel LeJeune, Hamid Javadi, Richard G. Baraniuk

Abstract: Among the most successful methods for sparsifying deep (neural) networks are those that adaptively mask the network weights throughout training. By examining this masking, or dropout, in the linear case, we uncover a duality between such adaptive methods and regularization through the so-called "$η$-trick" that casts both as iteratively reweighted optimizations. We show that any dropout strategy that adapts to the weights in a monotonic way corresponds to an effective subquadratic regularization penalty, and therefore leads to sparse solutions. We obtain the effective penalties for several popular sparsification strategies, which are remarkably similar to classical penalties commonly used in sparse optimization. Considering variational dropout as a case study, we demonstrate similar empirical behavior between the adaptive dropout method and classical methods on the task of deep network sparsification, validating our theory.

Submitted 3 January, 2022; v1 submitted 14 June, 2021; originally announced June 2021.

Comments: 19 pages, 2 figures. Appeared in NeurIPS 2021. Small typographical correction

arXiv:2104.07714 [pdf]

doi 10.22042/isecure.2020.226400.535

Providing a hybrid cryptography algorithm for lightweight authentication protocol in RFID with urban traffic usage case

Authors: V. Chegeni, H. Haj Seyyed javadi, M. R Moazami Goudarzi, A. Rezakhani

Abstract: Today, the Internet of Things (IoT) is one of the emerging technologies that enable the connection and transfer of information through communication networks. The main idea of the IoT is the widespread presence of objects such as mobile devices, sensors, and RFID. With the increase in traffic volume in urban areas, the existing intelligent urban traffic management system based on IoT can be vital. Therefore, this paper focuses on security in urban traffic based on using RFID, and RFID tags are the subject of our scheme. We present a mutual authentication protocol that leads to privacy based on hybrid cryptography. Also, an authentication process with RFID tags is proposed that can be read at high speed. The protocol attempts to reduce computational complexity. At the same time, the proposed method can withstand attacks such as spoofing of tag and reader, tag tracking, and replay attacks.

Submitted 15 April, 2021; originally announced April 2021.

Comments: 10 pages, 2 figures

arXiv:2102.13340 [pdf]

Lightweight Key-Dependent Dynamic S-Boxes based on Hyperelliptic Curve for IoT Devices

Authors: Parvaneh Asghari, Seyyed Hamid Haj Seyyed Javadi

Abstract: Security is one of the main issues in the Internet of Things (IoT). Encryption plays a crucial role in making these systems secure. The Substitution Box (S-Box) has a significant impact in block encryption methods. Due to the restricted resource capacities of IoT nodes, providing a lightweight S-Box is a challenging problem. This paper presents a key-dependent S-Box using a hyperelliptic curve. The proposed S-Box is analytically evaluated using performance criteria including bijection, nonlinearity, strict avalanche effect, and algebraic degree. The evaluation results indicate that the proposed S-Box generation algorithm is an effective way to generate cryptographically strong S-Boxes.

Submitted 26 February, 2021; originally announced February 2021.

Comments: in Persian language

arXiv:2102.07137 [pdf]

A novel key pre-distribution scheme based on $μ$-PBIBD combinatorial design in the resource-constrained IoT network

Authors: Akbar Morshed Aski, Hamid Haj Seyyed Javadi

Abstract: In a resource-constrained IoT network, end nodes like WSN, RFID, and embedded systems are used which have memory, processing, and energy limitations. One of the key distribution solutions in these types of networks is to use a key pre-distribution scheme, which accomplishes the key distribution operation offline before the resource-constrained devices are deployed in the environment. Also, in order to reduce the computing and communication overhead of shared key discovery, the use of combinatorial design in key pre-distribution has been proposed as a solution in recent years. In this study, a $μ$-PBIBD combinatorial design is introduced and constructed, and the mapping of such a design to a key pre-distribution scheme in the resource-constrained IoT network is explained. Using such a key pre-distribution scheme, more keys are obtained for communication between two devices in the IoT network. This means that there will be a maximum of q + 2 keys between the two devices in the network, where q is a prime power; that is, instead of having a single common key for a direct secure connection, the two devices can have q + 2 common keys in their key chains. Accordingly, we would increase the resilience of the key pre-distribution scheme compared to the SBIBD, TD, Trade-KP, UKP *, RD * and 2-D $μ$-PBIBD designs. Keywords: resource-constrained IoT network; combinatorial design; $μ$-PBIBD; resilience.

Submitted 14 February, 2021; originally announced February 2021.

Comments: in Persian language

Journal ref: The CSI Journal on Computing Science and Information Technology Vol. 18, No.1, 2020

arXiv:1910.04743 [pdf, other]

The Implicit Regularization of Ordinary Least Squares Ensembles

Authors: Daniel LeJeune, Hamid Javadi, Richard G. Baraniuk

Abstract: Ensemble methods that average over a collection of independent predictors that are each limited to a subsampling of both the examples and features of the training data command a significant presence in machine learning, such as the ever-popular random forest, yet the nature of the subsampling effect, particularly of the features, is not well understood. We study the case of an ensemble of linear predictors, where each individual predictor is fit using ordinary least squares on a random submatrix of the data matrix. We show that, under standard Gaussianity assumptions, when the number of features selected for each predictor is optimally tuned, the asymptotic risk of a large ensemble is equal to the asymptotic ridge regression risk, which is known to be optimal among linear predictors in this setting. In addition to eliciting this implicit regularization that results from subsampling, we also connect this ensemble to the dropout technique used in training deep (neural) networks, another strategy that has been shown to have a ridge-like regularizing effect.

Submitted 24 March, 2020; v1 submitted 10 October, 2019; originally announced October 2019.

Comments: 18 pages, 4 figures. To appear in AISTATS 2020

arXiv:1905.11639 [pdf, other]

Implicit Rugosity Regularization via Data Augmentation

Authors: Daniel LeJeune, Randall Balestriero, Hamid Javadi, Richard G. Baraniuk

Abstract: Deep (neural) networks have been applied productively in a wide range of supervised and unsupervised learning tasks. Unlike classical machine learning algorithms, deep networks typically operate in the overparameterized regime, where the number of parameters is larger than the number of training data points. Consequently, understanding the generalization properties and the role of (explicit or implicit) regularization in these networks is of great importance. In this work, we explore how the oft-used heuristic of data augmentation imposes an implicit regularization penalty of a novel measure of the rugosity or "roughness" based on the tangent Hessian of the function fit to the training data.

Submitted 10 October, 2019; v1 submitted 28 May, 2019; originally announced May 2019.

Comments: 15 pages, 12 figures

arXiv:1803.04464 [pdf, other]

False Discovery Rate Control via Debiased Lasso

Authors: Adel Javanmard, Hamid Javadi

Abstract: We consider the problem of variable selection in high-dimensional statistical models where the goal is to report a set of variables, out of many predictors $X_1, \dotsc, X_p$, that are relevant to a response of interest. For linear high-dimensional models, where the number of parameters exceeds the number of samples $(p>n)$, we propose a procedure for variable selection and prove that it controls the "directional" false discovery rate (FDR) below a pre-assigned significance level $q\in [0,1]$. We further analyze the statistical power of our framework and show that for designs with subgaussian rows and a common precision matrix $Ω\in\mathbb{R}^{p\times p}$, if the minimum nonzero parameter $θ_{\min}$ satisfies $$\sqrt{n} θ_{\min} - σ\sqrt{2(\max_{i\in [p]}Ω_{ii})\log\left(\frac{2p}{qs_0}\right)} \to \infty\,,$$ then this procedure achieves asymptotic power one. Our framework is built upon the debiasing approach and assumes the standard condition $s_0 = o(\sqrt{n}/(\log p)^2)$, where $s_0$ indicates the number of true positives among the $p$ features. Notably, this framework achieves exact directional FDR control without any assumption on the amplitude of unknown regression parameters, and does not require any knowledge of the distribution of covariates or the noise level. We test our method in synthetic and real data experiments to assess its performance and to corroborate our theoretical results.

Submitted 19 March, 2019; v1 submitted 12 March, 2018; originally announced March 2018.

Comments: accepted for publication in the Electronic Journal of Statistics

arXiv:1711.02053 [pdf, other]

doi 10.1007/978-3-319-74412-4_7

Detecting Community Structure in Dynamic Social Networks Using the Concept of Leadership

Authors: Saeed Haji Seyed Javadi, Pedram Gharani, Shahram Khadivi

Abstract: Detecting community structure in social networks is a fundamental problem empowering us to identify groups of actors with similar interests. There has been extensive work on finding communities in static networks; in reality, however, social networks are dynamic and evolve continuously. Ignoring the dynamic aspect of social networks allows us neither to capture the evolutionary behavior of the network nor to predict the future status of individuals. Aside from being dynamic, another significant characteristic of real-world social networks is the presence of leaders, i.e., nodes with high degree centrality that strongly attract other members and hence form a local community. In this paper, we devise an efficient method to incrementally detect communities in highly dynamic social networks using the intuitive idea of importance and persistence of community leaders over time. Our proposed method is able to find new communities based on the previous structure of the network without recomputing them from scratch. This feature enables us to efficiently detect and track communities over time. Experimental results on synthetic and real-world social networks demonstrate that our method is both effective and efficient in discovering communities in dynamic social networks.

Submitted 6 November, 2017; originally announced November 2017.

arXiv:1710.02196 [pdf, other]

Porcupine Neural Networks: (Almost) All Local Optima are Global

Authors: Soheil Feizi, Hamid Javadi, Jesse Zhang, David Tse

Abstract: Neural networks have been used prominently in several machine learning and statistics applications. In general, the underlying optimization of neural networks is non-convex which makes their performance analysis challenging. In this paper, we take a novel approach to this problem by asking whether one can constrain neural network weights to make its optimization landscape have good theoretical properties while at the same time, be a good approximation for the unconstrained one. For two-layer neural networks, we provide affirmative answers to these questions by introducing Porcupine Neural Networks (PNNs) whose weight vectors are constrained to lie over a finite set of lines. We show that most local optima of PNN optimizations are global while we have a characterization of regions where bad local optimizers may exist. Moreover, our theoretical and empirical results suggest that an unconstrained neural network can be approximated using a polynomially-large PNN.

Submitted 5 October, 2017; originally announced October 2017.

arXiv:1705.02994 [pdf, other]

Non-negative Matrix Factorization via Archetypal Analysis

Authors: Hamid Javadi, Andrea Montanari

Abstract: Given a collection of data points, non-negative matrix factorization (NMF) suggests to express them as convex combinations of a small set of `archetypes' with non-negative entries. This decomposition is unique only if the true archetypes are non-negative and sufficiently sparse (or the weights are sufficiently sparse), a regime that is captured by the separability condition and its generalizations. In this paper, we study an approach to NMF that can be traced back to the work of Cutler and Breiman (1994) and does not require the data to be separable, while providing a generally unique decomposition. We optimize the trade-off between two objectives: we minimize the distance of the data points from the convex envelope of the archetypes (which can be interpreted as an empirical risk), while minimizing the distance of the archetypes from the convex envelope of the data (which can be interpreted as a data-dependent regularization). The archetypal analysis method of (Cutler, Breiman, 1994) is recovered as the limiting case in which the last term is given infinite weight. We introduce a `uniqueness condition' on the data which is necessary for exactly recovering the archetypes from noiseless data. We prove that, under uniqueness (plus additional regularity conditions on the geometry of the archetypes), our estimator is robust. While our approach requires solving a non-convex optimization problem, we find that standard optimization methods succeed in finding good solutions both for real and synthetic data.

Submitted 8 May, 2017; originally announced May 2017.

Comments: 39 pages; 11 pdf figures

arXiv:1511.05254 [pdf, other]

A Statistical Model for Motifs Detection

Authors: Hamid Javadi, Andrea Montanari

Abstract: We consider a statistical model for the problem of finding subgraphs with specified topology in an otherwise random graph. This task plays an important role in the analysis of social and biological networks. In these types of networks, small subgraphs with a specific structure have important functional roles, and they are referred to as `motifs.' Within this model, one or multiple copies of a subgraph are added (`planted') in an Erdős-Rényi random graph with $n$ vertices and edge probability $q_0$. We ask whether the resulting graph can be distinguished reliably from a pure Erdős-Rényi random graph, and we present two types of result. First we investigate the question from a purely statistical perspective, and ask whether there is any test that can distinguish between the two graph models. We provide necessary and sufficient conditions that are essentially tight for small enough subgraphs. Next we study two polynomial-time algorithms for solving the same problem: a spectral algorithm, and a semidefinite programming (SDP) relaxation. For the spectral algorithm, we establish sufficient conditions under which it distinguishes the two graph models with high probability. Under the same conditions the spectral algorithm indeed identifies the hidden subgraph. The spectral algorithm is substantially sub-optimal with respect to the optimal test. We show that a similar gap is present for the more sophisticated SDP approach.

Submitted 20 October, 2017; v1 submitted 16 November, 2015; originally announced November 2015.

Comments: 40 pages, 1 pdf figure

Showing 1–16 of 16 results for author: Javadi, H