-
Asymptotics of the Sketched Pseudoinverse
Authors:
Daniel LeJeune,
Pratik Patil,
Hamid Javadi,
Richard G. Baraniuk,
Ryan J. Tibshirani
Abstract:
We take a random matrix theory approach to random sketching and show an asymptotic first-order equivalence of the regularized sketched pseudoinverse of a positive semidefinite matrix to a certain evaluation of the resolvent of the same matrix. We focus on real-valued regularization and extend previous results on an asymptotic equivalence of random matrices to the real setting, providing a precise characterization of the equivalence even under negative regularization, including a precise characterization of the smallest nonzero eigenvalue of the sketched matrix, which may be of independent interest. We then further characterize the second-order equivalence of the sketched pseudoinverse. We also apply our results to the analysis of the sketch-and-project method and to sketched ridge regression. Lastly, we prove that these results generalize to asymptotically free sketching matrices, obtaining the resulting equivalence for orthogonal sketching matrices and comparing our results to several common sketches used in practice.
Submitted 6 October, 2023; v1 submitted 7 November, 2022;
originally announced November 2022.
-
A Blessing of Dimensionality in Membership Inference through Regularization
Authors:
Jasper Tan,
Daniel LeJeune,
Blake Mason,
Hamid Javadi,
Richard G. Baraniuk
Abstract:
Is overparameterization a privacy liability? In this work, we study the effect that the number of parameters has on a classifier's vulnerability to membership inference attacks. We first demonstrate how the number of parameters of a model can induce a privacy--utility trade-off: increasing the number of parameters generally improves generalization performance at the expense of lower privacy. However, remarkably, we then show that if coupled with proper regularization, increasing the number of parameters of a model can actually simultaneously increase both its privacy and performance, thereby eliminating the privacy--utility trade-off. Theoretically, we demonstrate this curious phenomenon for logistic regression with ridge regularization in a bi-level feature ensemble setting. Pursuant to our theoretical exploration, we develop a novel leave-one-out analysis tool to precisely characterize the vulnerability of a linear classifier to the optimal membership inference attack. We empirically exhibit this "blessing of dimensionality" for neural networks on a variety of tasks using early stopping as the regularizer.
Submitted 13 April, 2023; v1 submitted 27 May, 2022;
originally announced May 2022.
-
Parameters or Privacy: A Provable Tradeoff Between Overparameterization and Membership Inference
Authors:
Jasper Tan,
Blake Mason,
Hamid Javadi,
Richard G. Baraniuk
Abstract:
A surprising phenomenon in modern machine learning is the ability of a highly overparameterized model to generalize well (small error on the test data) even when it is trained to memorize the training data (zero error on the training data). This has led to an arms race towards increasingly overparameterized models (cf. deep learning). In this paper, we study an underexplored hidden cost of overparameterization: the fact that overparameterized models may be more vulnerable to privacy attacks, in particular the membership inference attack that predicts the (potentially sensitive) examples used to train a model. We significantly extend the relatively few empirical results on this problem by theoretically proving for an overparameterized linear regression model in the Gaussian data setting that membership inference vulnerability increases with the number of parameters. Moreover, a range of empirical studies indicates that more complex, nonlinear models exhibit the same behavior. Finally, we extend our analysis towards ridge-regularized linear regression and show in the Gaussian data setting that increased regularization also increases membership inference vulnerability in the overparameterized regime.
Submitted 30 November, 2022; v1 submitted 2 February, 2022;
originally announced February 2022.
-
Adaptive Steganography Based on bargain Game
Authors:
Behbod Keshavarzi,
Hamidreza Navidi,
Parvaneh Asghari,
Seyyed Hamid Haji Seyyed Javadi
Abstract:
The capacity and security of the confidential message on the channel are two important challenges in steganography. In this paper, a new block steganography model is presented using the bargaining method, so that a competitive model is introduced. In this game, the blocks are the players. The bargain aims to embed information without reducing capacity while also increasing security. The proposed model can be used in both the spatial domain and the transform domain, which are two important approaches to steganography. For this purpose, an example of a spatial-domain model is introduced. In the first step, the image is divided into $n \times n$ blocks. In the second step, a graph coloring algorithm selects the pixels in each block in which confidential information is embedded. In the third step, following the bargaining method in game theory, each block plays the role of a player, and the competition between players is based on a defined goal function; blocks are ranked by the two criteria of capacity and security, so that a block with a higher security-to-capacity ratio has a higher priority, which is determined by the bargaining model. Information is embedded in the two least significant bits (LSB). An example of a transform-domain method also shows that security increases without a decrease in capacity. The results are evaluated by three criteria: PSNR, histogram, and $ε$-secure. In addition, 2000 standard images were evaluated, and the proposed method was observed to improve on block methods of embedding information.
Submitted 10 May, 2022; v1 submitted 8 November, 2021;
originally announced November 2021.
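The abstract above mentions embedding information in the two least significant bits of selected pixels. A minimal sketch of that one step is given below, assuming 8-bit grayscale pixel values held in a plain list; the function names `embed_two_lsb` and `extract_two_lsb` are hypothetical, and the block selection by graph coloring and bargaining described in the abstract is not modeled here.

```python
def embed_two_lsb(pixels, bits):
    """Write message bits, two at a time, into the two lowest bits of each pixel."""
    if len(bits) % 2:
        bits = bits + [0]  # pad to a whole number of 2-bit symbols
    if len(bits) // 2 > len(pixels):
        raise ValueError("message too long for this block")
    stego = list(pixels)
    for i in range(0, len(bits), 2):
        symbol = (bits[i] << 1) | bits[i + 1]
        # Clear the two lowest bits, then set them to the 2-bit symbol.
        stego[i // 2] = (stego[i // 2] & ~0b11) | symbol
    return stego


def extract_two_lsb(stego, n_bits):
    """Read the two lowest bits of each pixel and return the first n_bits."""
    out = []
    for value in stego:
        out.extend([(value >> 1) & 1, value & 1])
    return out[:n_bits]


block = [12, 200, 37, 255]      # hypothetical pixel values of one block
message = [1, 0, 1, 1, 0, 1]    # hypothetical message bits
stego = embed_two_lsb(block, message)
recovered = extract_two_lsb(stego, len(message))
```

Each pixel changes by at most 3 gray levels under this scheme, which is why two-bit LSB embedding trades a small distortion for twice the capacity of one-bit embedding.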
-
Binary code optimization
Authors:
Parviz Gharehbagheri,
Sayeed Hamid Haji Sayeed Javadi,
Parvaneh Asghari,
Naser Gharehbagheri
Abstract:
This article shows that any type of binary data can be defined as a collection of variable-length codewords. This feature helps us define an injective and surjective function from the suggested codewords to the required codewords. Therefore, by substituting the new codewords, the binary data is transformed into other binary data that serves the intended goals. One of these goals is to reduce data size: the original codewords of the binary data are replaced with Huffman codewords. One feature of this method is that compression is never negative for any type of binary data; that is, regardless of the size of the code table, the difference between the original data size and the data size after compression is greater than or equal to zero. Another important and practical feature of this method is the use of symmetric codewords instead of the suggested codewords in order to obtain symmetry, reversibility, and error resistance with two-way decoding.
Submitted 2 October, 2021;
originally announced October 2021.
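The idea of replacing original codewords with Huffman codewords, mentioned in the abstract above, can be illustrated with a small sketch. It assumes the original codewords are fixed-length 8-bit bytes, which is a simplification of the variable-length codewords the abstract describes, and it counts payload bits only, without the code table. The function name `huffman_code_lengths` is hypothetical and this is textbook Huffman coding, not the authors' method.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: Huffman codeword length in bits} for a symbol sequence."""
    counts = Counter(symbols)
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tie breaker, {symbol: depth in subtree}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


data = b"aaaaaaaabbbbccd"                      # hypothetical input
lengths = huffman_code_lengths(data)
original_bits = 8 * len(data)                  # fixed-length 8-bit codewords
payload_bits = sum(lengths[s] for s in data)   # Huffman payload, table excluded
```

Because a fixed-length 8-bit code is itself a prefix code and Huffman coding is optimal among prefix codes, the payload here cannot exceed `original_bits`; whether that still holds once the code table is included is the claim made in the abstract, not something this sketch checks.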
-
The Flip Side of the Reweighted Coin: Duality of Adaptive Dropout and Regularization
Authors:
Daniel LeJeune,
Hamid Javadi,
Richard G. Baraniuk
Abstract:
Among the most successful methods for sparsifying deep (neural) networks are those that adaptively mask the network weights throughout training. By examining this masking, or dropout, in the linear case, we uncover a duality between such adaptive methods and regularization through the so-called "$η$-trick" that casts both as iteratively reweighted optimizations. We show that any dropout strategy that adapts to the weights in a monotonic way corresponds to an effective subquadratic regularization penalty, and therefore leads to sparse solutions. We obtain the effective penalties for several popular sparsification strategies, which are remarkably similar to classical penalties commonly used in sparse optimization. Considering variational dropout as a case study, we demonstrate similar empirical behavior between the adaptive dropout method and classical methods on the task of deep network sparsification, validating our theory.
Submitted 3 January, 2022; v1 submitted 14 June, 2021;
originally announced June 2021.
-
Providing a hybrid cryptography algorithm for lightweight authentication protocol in RFID with urban traffic usage case
Authors:
V. Chegeni,
H. Haj Seyyed javadi,
M. R Moazami Goudarzi,
A. Rezakhani
Abstract:
Today, the Internet of Things (IoT) is one of the emerging technologies that enable the connection and transfer of information through communication networks. The main idea of the IoT is the widespread presence of objects such as mobile devices, sensors, and RFID. With the increase in traffic volume in urban areas, existing IoT-based intelligent urban traffic management systems can be vital. Therefore, this paper focuses on security in urban traffic based on RFID, and RFID tags are the subject of our scheme. We present a mutual authentication protocol based on hybrid cryptography that provides privacy. Also, an authentication process is proposed for RFID tags that can be read at high speed. The protocol attempts to reduce computational complexity. At the same time, the proposed method can withstand attacks such as tag and reader spoofing, tag tracking, and replay attacks.
Submitted 15 April, 2021;
originally announced April 2021.
-
Lightweight Key-Dependent Dynamic S-Boxes based on Hyperelliptic Curve for IoT Devices
Authors:
Parvaneh Asghari,
Seyyed Hamid Haj Seyyed Javadi
Abstract:
Security is one of the main issues in the Internet of Things (IoT). Encryption plays a crucial role in making these systems secure. The Substitution Box (S-Box) has a significant impact on block encryption methods. Due to the restricted resource capacities of IoT nodes, providing a lightweight S-Box is a challenging problem. This paper presents a key-dependent S-Box using a hyperelliptic curve. The proposed S-Box is analytically evaluated using performance criteria including bijection, nonlinearity, strict avalanche effect, and algebraic degree. The evaluation results indicate that the proposed S-Box generation algorithm is an effective way to generate cryptographically strong S-Boxes.
Submitted 26 February, 2021;
originally announced February 2021.
-
A novel key pre-distribution scheme based on $μ$-PBIBD combinatorial design in the resource-constrained IoT network
Authors:
Akbar Morshed Aski,
Hamid Haj Seyyed Javadi
Abstract:
In a resource-constrained IoT network, end nodes such as WSN, RFID, and embedded systems are used, which have memory, processing, and energy limitations. One key distribution solution in these types of networks is the key pre-distribution scheme, which performs key distribution offline before the resource-constrained devices are deployed in the environment. Also, in order to reduce the computing and communication overhead of shared key discovery, the use of combinatorial design in key pre-distribution has been proposed as a solution in recent years. In this study, a $μ$-PBIBD combinatorial design is introduced and constructed, and the mapping of this design to a key pre-distribution scheme in the resource-constrained IoT network is explained. Using this key pre-distribution scheme, more keys are obtained for communication between two devices in the IoT network: there will be a maximum of q + 2 keys between two devices in the network, where q is a prime power. That is, instead of having a single common key for a direct secure connection, the two devices can have q + 2 common keys in their key chains. Accordingly, the resilience of the key pre-distribution scheme is increased compared to the SBIBD, TD, Trade-KP, UKP *, RD * and 2-D $μ$-PBIBD designs.
Keywords: resource-constrained IoT network; combinatorial design; $μ$-PBIBD; resilience.
Submitted 14 February, 2021;
originally announced February 2021.
-
The Implicit Regularization of Ordinary Least Squares Ensembles
Authors:
Daniel LeJeune,
Hamid Javadi,
Richard G. Baraniuk
Abstract:
Ensemble methods that average over a collection of independent predictors that are each limited to a subsampling of both the examples and features of the training data command a significant presence in machine learning, such as the ever-popular random forest, yet the nature of the subsampling effect, particularly of the features, is not well understood. We study the case of an ensemble of linear predictors, where each individual predictor is fit using ordinary least squares on a random submatrix of the data matrix. We show that, under standard Gaussianity assumptions, when the number of features selected for each predictor is optimally tuned, the asymptotic risk of a large ensemble is equal to the asymptotic ridge regression risk, which is known to be optimal among linear predictors in this setting. In addition to eliciting this implicit regularization that results from subsampling, we also connect this ensemble to the dropout technique used in training deep (neural) networks, another strategy that has been shown to have a ridge-like regularizing effect.
Submitted 24 March, 2020; v1 submitted 10 October, 2019;
originally announced October 2019.
-
Implicit Rugosity Regularization via Data Augmentation
Authors:
Daniel LeJeune,
Randall Balestriero,
Hamid Javadi,
Richard G. Baraniuk
Abstract:
Deep (neural) networks have been applied productively in a wide range of supervised and unsupervised learning tasks. Unlike classical machine learning algorithms, deep networks typically operate in the \emph{overparameterized} regime, where the number of parameters is larger than the number of training data points. Consequently, understanding the generalization properties and the role of (explicit or implicit) regularization in these networks is of great importance. In this work, we explore how the oft-used heuristic of \emph{data augmentation} imposes an {\em implicit regularization} penalty of a novel measure of the \emph{rugosity} or "roughness" based on the tangent Hessian of the function fit to the training data.
Submitted 10 October, 2019; v1 submitted 28 May, 2019;
originally announced May 2019.
-
False Discovery Rate Control via Debiased Lasso
Authors:
Adel Javanmard,
Hamid Javadi
Abstract:
We consider the problem of variable selection in high-dimensional statistical models where the goal is to report a set of variables, out of many predictors $X_1, \dotsc, X_p$, that are relevant to a response of interest. For linear high-dimensional models, where the number of parameters exceeds the number of samples $(p>n)$, we propose a procedure for variable selection and prove that it controls the "directional" false discovery rate (FDR) below a pre-assigned significance level $q\in [0,1]$. We further analyze the statistical power of our framework and show that for designs with subgaussian rows and a common precision matrix $Ω\in\mathbb{R}^{p\times p}$, if the minimum nonzero parameter $θ_{\min}$ satisfies $$\sqrt{n} θ_{\min} - σ\sqrt{2(\max_{i\in [p]}Ω_{ii})\log\left(\frac{2p}{qs_0}\right)} \to \infty\,,$$ then this procedure achieves asymptotic power one. Our framework is built upon the debiasing approach and assumes the standard condition $s_0 = o(\sqrt{n}/(\log p)^2)$, where $s_0$ indicates the number of true positives among the $p$ features. Notably, this framework achieves exact directional FDR control without any assumption on the amplitude of unknown regression parameters, and does not require any knowledge of the distribution of covariates or the noise level. We test our method in synthetic and real data experiments to assess its performance and to corroborate our theoretical results.
Submitted 19 March, 2019; v1 submitted 12 March, 2018;
originally announced March 2018.
-
Detecting Community Structure in Dynamic Social Networks Using the Concept of Leadership
Authors:
Saeed Haji Seyed Javadi,
Pedram Gharani,
Shahram Khadivi
Abstract:
Detecting community structure in social networks is a fundamental problem that enables us to identify groups of actors with similar interests. There has been extensive work on finding communities in static networks; in reality, however, social networks are dynamic and evolve continuously. Ignoring the dynamic aspect of social networks allows us neither to capture the evolutionary behavior of the network nor to predict the future status of individuals. Aside from being dynamic, another significant characteristic of real-world social networks is the presence of leaders, i.e., nodes with high degree centrality that strongly attract other members and hence form a local community. In this paper, we devise an efficient method to incrementally detect communities in highly dynamic social networks using the intuitive idea of the importance and persistence of community leaders over time. Our proposed method is able to find new communities based on the previous structure of the network without recomputing them from scratch. This feature enables us to detect and track communities efficiently over time. Experimental results on synthetic and real-world social networks demonstrate that our method is both effective and efficient in discovering communities in dynamic social networks.
Submitted 6 November, 2017;
originally announced November 2017.
-
Porcupine Neural Networks: (Almost) All Local Optima are Global
Authors:
Soheil Feizi,
Hamid Javadi,
Jesse Zhang,
David Tse
Abstract:
Neural networks have been used prominently in several machine learning and statistics applications. In general, the underlying optimization of neural networks is non-convex which makes their performance analysis challenging. In this paper, we take a novel approach to this problem by asking whether one can constrain neural network weights to make its optimization landscape have good theoretical properties while at the same time, be a good approximation for the unconstrained one. For two-layer neural networks, we provide affirmative answers to these questions by introducing Porcupine Neural Networks (PNNs) whose weight vectors are constrained to lie over a finite set of lines. We show that most local optima of PNN optimizations are global while we have a characterization of regions where bad local optimizers may exist. Moreover, our theoretical and empirical results suggest that an unconstrained neural network can be approximated using a polynomially-large PNN.
Submitted 5 October, 2017;
originally announced October 2017.
-
Non-negative Matrix Factorization via Archetypal Analysis
Authors:
Hamid Javadi,
Andrea Montanari
Abstract:
Given a collection of data points, non-negative matrix factorization (NMF) suggests expressing them as convex combinations of a small set of `archetypes' with non-negative entries. This decomposition is unique only if the true archetypes are non-negative and sufficiently sparse (or the weights are sufficiently sparse), a regime that is captured by the separability condition and its generalizations.
In this paper, we study an approach to NMF that can be traced back to the work of Cutler and Breiman (1994) and does not require the data to be separable, while providing a generally unique decomposition. We optimize the trade-off between two objectives: we minimize the distance of the data points from the convex envelope of the archetypes (which can be interpreted as an empirical risk), while minimizing the distance of the archetypes from the convex envelope of the data (which can be interpreted as a data-dependent regularization). The archetypal analysis method of (Cutler, Breiman, 1994) is recovered as the limiting case in which the last term is given infinite weight.
We introduce a `uniqueness condition' on the data which is necessary for exactly recovering the archetypes from noiseless data. We prove that, under uniqueness (plus additional regularity conditions on the geometry of the archetypes), our estimator is robust. While our approach requires solving a non-convex optimization problem, we find that standard optimization methods succeed in finding good solutions both for real and synthetic data.
Submitted 8 May, 2017;
originally announced May 2017.
-
A Statistical Model for Motifs Detection
Authors:
Hamid Javadi,
Andrea Montanari
Abstract:
We consider a statistical model for the problem of finding subgraphs with specified topology in an otherwise random graph. This task plays an important role in the analysis of social and biological networks. In these types of networks, small subgraphs with a specific structure have important functional roles, and they are referred to as `motifs.'
Within this model, one or multiple copies of a subgraph are added (`planted') in an Erdős–Rényi random graph with $n$ vertices and edge probability $q_0$. We ask whether the resulting graph can be distinguished reliably from a pure Erdős–Rényi random graph, and we present two types of results. First we investigate the question from a purely statistical perspective, and ask whether there is any test that can distinguish between the two graph models. We provide necessary and sufficient conditions that are essentially tight for small enough subgraphs.
Next we study two polynomial-time algorithms for solving the same problem: a spectral algorithm, and a semidefinite programming (SDP) relaxation. For the spectral algorithm, we establish sufficient conditions under which it distinguishes the two graph models with high probability. Under the same conditions the spectral algorithm indeed identifies the hidden subgraph.
The spectral algorithm is substantially sub-optimal with respect to the optimal test. We show that a similar gap is present for the more sophisticated SDP approach.
Submitted 20 October, 2017; v1 submitted 16 November, 2015;
originally announced November 2015.
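The planted-subgraph model described in the abstract above can be sketched in a few lines. The example below assumes the planted motif is a clique on $k$ vertices, which is only one possible choice of subgraph; the helper names `erdos_renyi` and `plant_clique` and the values of `n`, `q0`, and `k` are hypothetical. It generates the two graphs to be distinguished and does not implement the spectral or SDP tests studied in the paper.

```python
import random
from itertools import combinations


def erdos_renyi(n, q0, rng):
    """Sample an undirected graph on n vertices; each edge appears with probability q0."""
    return {frozenset(e) for e in combinations(range(n), 2) if rng.random() < q0}


def plant_clique(edges, vertices):
    """Add every edge among the given vertices to the edge set."""
    return edges | {frozenset(e) for e in combinations(vertices, 2)}


rng = random.Random(0)            # fixed seed so the sketch is reproducible
n, q0, k = 30, 0.1, 5             # hypothetical sizes and edge probability
null_graph = erdos_renyi(n, q0, rng)
hidden = rng.sample(range(n), k)  # vertices that will carry the planted motif
planted_graph = plant_clique(null_graph, hidden)
```

The detection question in the abstract is whether `planted_graph` can be told apart from a fresh draw of `erdos_renyi(n, q0, rng)` without knowing `hidden`.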