-
CO2 capture using boron, nitrogen, and phosphorus-doped C20 in the present electric field: A DFT study
Authors:
Parham Rezaee,
Shervin Alikhah Asl,
Mohammad Hasan Javadi,
Shahab Rezaee,
Razieh Morad,
Mahmood Akbari,
Seyed Shahriar Arab,
Malik Maaza
Abstract:
Burning fossil fuels emits a significant amount of CO2, causing climate change concerns. CO2 Capture and Storage (CCS) aims to reduce emissions, with fullerenes showing promise as CO2 adsorbents. Recent research focuses on modifying fullerenes using an electric field. In light of this, we carried out DFT studies on B-, N-, and P-doped C20 (C20-nXn (n = 0, 1, 2, and 3; X = B, N, and P)) in the absence and presence of an electric field in the range of 0-0.02 a.u. The cohesive energy was calculated to verify their thermodynamic stability, showing that, despite having lower cohesive energies than C20, they fall in a favorable range. Moreover, the charge distribution for all structures was depicted using the ESP map. Most importantly, we evaluated the adsorption energy, height, and CO2 angle, demonstrating that the B- and N-doped fullerenes interact more strongly with CO2 than C20 does by a wide margin, shifting the interaction from physisorption to physicochemical adsorption. Although the adsorption energy of P-doped fullerenes was not as satisfactory, in most cases increasing the electric field enhanced CO2 adsorption and added chemical character to the CO2-fullerene interaction. From the HOMO--LUMO plots we found that, unlike in the P-doped C20, the surprising activity of B- and N-doped C20 toward CO2 originates from a high concentration of the HOMO and LUMO orbitals on the B and N atoms. In the present article, we attempt to introduce more effective fullerene-based materials for CO2 capture, along with strategies to enhance their efficiency, and to reveal the nature of adsorption on B-, N-, and P-doped fullerenes.
Submitted 21 March, 2024;
originally announced March 2024.
-
Asymptotics of the Sketched Pseudoinverse
Authors:
Daniel LeJeune,
Pratik Patil,
Hamid Javadi,
Richard G. Baraniuk,
Ryan J. Tibshirani
Abstract:
We take a random matrix theory approach to random sketching and show an asymptotic first-order equivalence of the regularized sketched pseudoinverse of a positive semidefinite matrix to a certain evaluation of the resolvent of the same matrix. We focus on real-valued regularization and extend previous results on an asymptotic equivalence of random matrices to the real setting, providing a precise characterization of the equivalence even under negative regularization, including a precise characterization of the smallest nonzero eigenvalue of the sketched matrix, which may be of independent interest. We then further characterize the second-order equivalence of the sketched pseudoinverse. We also apply our results to the analysis of the sketch-and-project method and to sketched ridge regression. Lastly, we prove that these results generalize to asymptotically free sketching matrices, obtaining the resulting equivalence for orthogonal sketching matrices and comparing our results to several common sketches used in practice.
Submitted 6 October, 2023; v1 submitted 7 November, 2022;
originally announced November 2022.
-
A Blessing of Dimensionality in Membership Inference through Regularization
Authors:
Jasper Tan,
Daniel LeJeune,
Blake Mason,
Hamid Javadi,
Richard G. Baraniuk
Abstract:
Is overparameterization a privacy liability? In this work, we study the effect that the number of parameters has on a classifier's vulnerability to membership inference attacks. We first demonstrate how the number of parameters of a model can induce a privacy--utility trade-off: increasing the number of parameters generally improves generalization performance at the expense of lower privacy. However, remarkably, we then show that if coupled with proper regularization, increasing the number of parameters of a model can actually simultaneously increase both its privacy and performance, thereby eliminating the privacy--utility trade-off. Theoretically, we demonstrate this curious phenomenon for logistic regression with ridge regularization in a bi-level feature ensemble setting. Pursuant to our theoretical exploration, we develop a novel leave-one-out analysis tool to precisely characterize the vulnerability of a linear classifier to the optimal membership inference attack. We empirically exhibit this "blessing of dimensionality" for neural networks on a variety of tasks using early stopping as the regularizer.
Submitted 13 April, 2023; v1 submitted 27 May, 2022;
originally announced May 2022.
-
Parameters or Privacy: A Provable Tradeoff Between Overparameterization and Membership Inference
Authors:
Jasper Tan,
Blake Mason,
Hamid Javadi,
Richard G. Baraniuk
Abstract:
A surprising phenomenon in modern machine learning is the ability of a highly overparameterized model to generalize well (small error on the test data) even when it is trained to memorize the training data (zero error on the training data). This has led to an arms race towards increasingly overparameterized models (c.f., deep learning). In this paper, we study an underexplored hidden cost of overparameterization: the fact that overparameterized models may be more vulnerable to privacy attacks, in particular the membership inference attack that predicts the (potentially sensitive) examples used to train a model. We significantly extend the relatively few empirical results on this problem by theoretically proving for an overparameterized linear regression model in the Gaussian data setting that membership inference vulnerability increases with the number of parameters. Moreover, a range of empirical studies indicates that more complex, nonlinear models exhibit the same behavior. Finally, we extend our analysis towards ridge-regularized linear regression and show in the Gaussian data setting that increased regularization also increases membership inference vulnerability in the overparameterized regime.
Submitted 30 November, 2022; v1 submitted 2 February, 2022;
originally announced February 2022.
-
Adaptive Steganography Based on bargain Game
Authors:
Behbod Keshavarzi,
Hamidreza Navidi,
Parvaneh Asghari,
Seyyed Hamid Haji Seyyed Javadi
Abstract:
Capacity and security of the confidential message on the channel are two important challenges in steganography. In this paper, a new block steganography model is presented that uses the bargaining method to introduce a competitive model. In this game, the blocks are the players. The bargain aims to embed information without reducing capacity while increasing security. The proposed model can be used in both the spatial domain and the transform domain, which are two important methods of steganography. For this purpose, an example of a spatial domain model is introduced. In the first step, the image is divided into $n \times n$ blocks. In the second step, a graph coloring algorithm selects the pixels in each block in which confidential information is embedded. In the third step, following the bargaining method in game theory, each block plays the role of a player, and the competition between players is based on the defined goal function; blocks with a higher security-to-capacity ratio receive higher priority, as determined by the bargaining model. Information is embedded in the two least significant bits (LSB). An example of a transform domain method also shows that security increases without a decrease in capacity. The results are evaluated by three criteria: PSNR, histogram, and $ε$-security. In addition, 2000 standard images were evaluated, and the proposed method was observed to improve on block methods of embedding information.
Submitted 10 May, 2022; v1 submitted 8 November, 2021;
originally announced November 2021.
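The abstract above mentions embedding information in the two least significant bits of a pixel. The following is a minimal sketch of that generic step only, assuming 8-bit grayscale pixel values held in a plain list. It does not implement the block selection, graph coloring, or bargaining model of the paper, and the function names are hypothetical.

```python
def embed_two_lsb(pixel: int, bits: int) -> int:
    """Replace the two least significant bits of an 8-bit pixel with `bits` (0..3)."""
    if not 0 <= pixel <= 255:
        raise ValueError("pixel must be an 8-bit value")
    if not 0 <= bits <= 3:
        raise ValueError("bits must fit in two bits")
    return (pixel & 0b11111100) | bits


def extract_two_lsb(pixel: int) -> int:
    """Read back the two least significant bits of a pixel."""
    return pixel & 0b11


def embed_message(pixels: list[int], message: bytes) -> list[int]:
    """Hide `message` in the first 4 * len(message) pixels, two bits per pixel."""
    # Split every byte into four 2-bit chunks, most significant chunk first.
    chunks = [(byte >> shift) & 0b11 for byte in message for shift in (6, 4, 2, 0)]
    if len(chunks) > len(pixels):
        raise ValueError("cover is too small for this message")
    stego = list(pixels)
    for i, chunk in enumerate(chunks):
        stego[i] = embed_two_lsb(stego[i], chunk)
    return stego


def extract_message(pixels: list[int], n_bytes: int) -> bytes:
    """Recover `n_bytes` bytes hidden by `embed_message`."""
    out = bytearray()
    for i in range(n_bytes):
        byte = 0
        for pixel in pixels[4 * i:4 * i + 4]:
            byte = (byte << 2) | extract_two_lsb(pixel)
        out.append(byte)
    return bytes(out)
```

Because only the two lowest bits change, each modified pixel moves by at most 3 gray levels, which is why this kind of embedding tends to keep PSNR high.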
-
Binary code optimization
Authors:
Parviz Gharehbagheri,
Sayeed Hamid Haji Sayeed Javadi,
Parvaneh Asghari,
Naser Gharehbagheri
Abstract:
This article shows that any type of binary data can be defined as a collection of codewords of variable length. This feature helps us to define an injective and surjective function from the suggested codewords to the required codewords. Therefore, by substituting the new codewords, the binary data is transformed into other binary data that serves the intended goals. One of these goals is to reduce data size: the original codewords of the binary data are replaced with Huffman codewords. One feature of this method is non-negative compression for any type of binary data; that is, regardless of the size of the code table, the difference between the original data size and the data size after compression will be greater than or equal to zero. Another important and practical feature of this method is the use of symmetric codewords instead of the suggested codewords in order to provide symmetry, reversibility, and error resistance with two-way decoding.
Submitted 2 October, 2021;
originally announced October 2021.
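To illustrate the codeword replacement described above, here is a minimal sketch of textbook Huffman coding, assuming fixed 8-bit bytes as the original codewords. It is not the authors' variable-length scheme or their symmetric codewords, and it ignores the cost of storing the code table; the function names are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(data: bytes) -> dict[int, str]:
    """Build a prefix-free code (byte value -> bit string) from byte frequencies."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit codeword.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: codeword built so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees prepends one bit to every codeword inside them.
        merged = {sym: "0" + cw for sym, cw in left.items()}
        merged.update({sym: "1" + cw for sym, cw in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(data: bytes, code: dict[int, str]) -> str:
    """Replace each original byte with its Huffman codeword."""
    return "".join(code[b] for b in data)


def decode(bits: str, code: dict[int, str]) -> bytes:
    """Invert `encode`; this works because the code is prefix-free."""
    inverse = {cw: sym for sym, cw in code.items()}
    out = bytearray()
    current = ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return bytes(out)
```

The mapping from original bytes to Huffman codewords is one-to-one, so the substitution is reversible, and the encoded bit string is never longer than the fixed 8-bit representation of the same data.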
-
The Flip Side of the Reweighted Coin: Duality of Adaptive Dropout and Regularization
Authors:
Daniel LeJeune,
Hamid Javadi,
Richard G. Baraniuk
Abstract:
Among the most successful methods for sparsifying deep (neural) networks are those that adaptively mask the network weights throughout training. By examining this masking, or dropout, in the linear case, we uncover a duality between such adaptive methods and regularization through the so-called "$η$-trick" that casts both as iteratively reweighted optimizations. We show that any dropout strategy that adapts to the weights in a monotonic way corresponds to an effective subquadratic regularization penalty, and therefore leads to sparse solutions. We obtain the effective penalties for several popular sparsification strategies, which are remarkably similar to classical penalties commonly used in sparse optimization. Considering variational dropout as a case study, we demonstrate similar empirical behavior between the adaptive dropout method and classical methods on the task of deep network sparsification, validating our theory.
Submitted 3 January, 2022; v1 submitted 14 June, 2021;
originally announced June 2021.
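For readers unfamiliar with the "$η$-trick" named in the abstract above, a commonly stated scalar form of the identity is given here as background; the paper's own formulation may differ. For a nonzero scalar weight $w$, $$|w| = \min_{η > 0} \frac{1}{2}\left(\frac{w^2}{η} + η\right),$$ with the minimum attained at $η = |w|$. For fixed $η$ the right-hand side is quadratic in $w$, so alternating between minimizing over $w$ and resetting $η = |w|$ turns an $\ell_1$-type penalty into a sequence of reweighted ridge problems, which is the sense in which such methods are iteratively reweighted optimizations.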
-
Providing a hybrid cryptography algorithm for lightweight authentication protocol in RFID with urban traffic usage case
Authors:
V. Chegeni,
H. Haj Seyyed Javadi,
M. R Moazami Goudarzi,
A. Rezakhani
Abstract:
Today, the Internet of Things (IoT) is one of the emerging technologies that enable the connection and transfer of information through communication networks. The main idea of the IoT is the widespread presence of objects such as mobile devices, sensors, and RFID. With the increase in traffic volume in urban areas, an intelligent urban traffic management system based on IoT can be vital. Therefore, this paper focuses on security in urban traffic based on RFID, and RFID tags are the subject of our scheme. We present a mutual authentication protocol that provides privacy based on hybrid cryptography. We also propose an authentication process for RFID tags that can be read at high speed. The protocol attempts to reduce computational complexity. At the same time, the proposed method can withstand attacks such as tag and reader spoofing, tag tracking, and replay attacks.
Submitted 15 April, 2021;
originally announced April 2021.
-
Lightweight Key-Dependent Dynamic S-Boxes based on Hyperelliptic Curve for IoT Devices
Authors:
Parvaneh Asghari,
Seyyed Hamid Haj Seyyed Javadi
Abstract:
Security is one of the main issues in the Internet of Things (IoT). Encryption plays a crucial role in making these systems secure. The Substitution Box (S-Box) has a significant impact in block encryption methods. Due to the restricted resource capacities of IoT nodes, providing a lightweight S-Box is a challenging problem. This paper presents a key-dependent S-Box using a hyperelliptic curve. The proposed S-Box is analytically evaluated using performance criteria including bijection, nonlinearity, strict avalanche effect, and algebraic degree. The evaluation results indicate that the proposed S-Box generation algorithm is an effective way to generate cryptographically strong S-Boxes.
Submitted 26 February, 2021;
originally announced February 2021.
-
A novel key pre-distribution scheme based on $μ$-PBIBD combinatorial design in the resource-constrained IoT network
Authors:
Akbar Morshed Aski,
Hamid Haj Seyyed Javadi
Abstract:
In a resource-constrained IoT network, end nodes such as WSN, RFID, and embedded systems are used, which have memory, processing, and energy limitations. One key distribution solution in these types of networks is the key pre-distribution scheme, which performs the key distribution operation offline before the resource-constrained devices are deployed in the environment. In addition, to reduce the computing and communication overhead of shared key discovery, the use of combinatorial designs in key pre-distribution has been proposed in recent years. In this study, a $μ$-PBIBD combinatorial design is introduced and constructed, and the mapping of this design to a key pre-distribution scheme in the resource-constrained IoT network is explained. With this key pre-distribution scheme, more keys are obtained for communication between two devices in the IoT network: there will be a maximum of q + 2 keys between two devices in the network, where q is a prime power. That is, instead of having a single common key for a direct secure connection, the two devices can have q + 2 common keys in their key chains. Accordingly, the resilience of the key pre-distribution scheme increases compared to the SBIBD, TD, Trade-KP, UKP*, RD*, and 2-D $μ$-PBIBD designs.
Keywords: resource-constrained IoT network; combinatorial design; $μ$-PBIBD; resilience.
Submitted 14 February, 2021;
originally announced February 2021.
-
Radar networks: A review of features and challenges
Authors:
S. Hamed Javadi,
Alfonso Farina
Abstract:
Networks of multiple radars are typically used for improving coverage and tracking accuracy. Recently, such networks have facilitated the deployment of commercial radars for civilian applications such as healthcare, gesture recognition, home security, and autonomous automobiles. They exploit advanced signal processing techniques together with efficient data fusion methods in order to yield high performance in event detection and tracking. This paper reviews outstanding features of radar networks, their challenges, and their state-of-the-art solutions from the perspective of signal processing. Each of the subjects discussed could develop into an active research topic.
Submitted 28 March, 2020;
originally announced March 2020.
-
The Implicit Regularization of Ordinary Least Squares Ensembles
Authors:
Daniel LeJeune,
Hamid Javadi,
Richard G. Baraniuk
Abstract:
Ensemble methods that average over a collection of independent predictors that are each limited to a subsampling of both the examples and features of the training data command a significant presence in machine learning, such as the ever-popular random forest, yet the nature of the subsampling effect, particularly of the features, is not well understood. We study the case of an ensemble of linear predictors, where each individual predictor is fit using ordinary least squares on a random submatrix of the data matrix. We show that, under standard Gaussianity assumptions, when the number of features selected for each predictor is optimally tuned, the asymptotic risk of a large ensemble is equal to the asymptotic ridge regression risk, which is known to be optimal among linear predictors in this setting. In addition to eliciting this implicit regularization that results from subsampling, we also connect this ensemble to the dropout technique used in training deep (neural) networks, another strategy that has been shown to have a ridge-like regularizing effect.
Submitted 24 March, 2020; v1 submitted 10 October, 2019;
originally announced October 2019.
-
Implicit Rugosity Regularization via Data Augmentation
Authors:
Daniel LeJeune,
Randall Balestriero,
Hamid Javadi,
Richard G. Baraniuk
Abstract:
Deep (neural) networks have been applied productively in a wide range of supervised and unsupervised learning tasks. Unlike classical machine learning algorithms, deep networks typically operate in the \emph{overparameterized} regime, where the number of parameters is larger than the number of training data points. Consequently, understanding the generalization properties and the role of (explicit or implicit) regularization in these networks is of great importance. In this work, we explore how the oft-used heuristic of \emph{data augmentation} imposes an {\em implicit regularization} penalty of a novel measure of the \emph{rugosity} or "roughness" based on the tangent Hessian of the function fit to the training data.
Submitted 10 October, 2019; v1 submitted 28 May, 2019;
originally announced May 2019.
-
Room Temperature Terahertz Spectrometer with Quantum-Level Sensitivity
Authors:
Ning Wang,
Semih Cakmakyapan,
Yen-Ju Lin,
Hamid Javadi,
Mona Jarrahi
Abstract:
Photon detection with quantum-level sensitivity is particularly challenging in the terahertz regime (0.1-10 THz), which contains ~98% of all the photons existing in the universe. Near-quantum-limited terahertz spectrometry has so far only been possible through the use of cryogenically cooled superconducting mixers as frequency downconverters. Here we introduce a spectrometry scheme that uses plasmonic photomixing for frequency downconversion to offer quantum-level sensitivities at room temperature for the first time. Frequency downconversion is achieved by mixing terahertz radiation and a heterodyning optical beam with a terahertz beat frequency in a plasmonics-enhanced semiconductor active region. We demonstrate spectrometer sensitivities down to 3 times the quantum-limit at room temperature. With a versatile design capable of broadband spectrometry, this plasmonic photomixer has broad applicability to quantum optics, chemical sensing, biological studies, medical diagnosis, high data-rate communication, as well as astronomy and atmospheric studies.
Submitted 10 June, 2018;
originally announced June 2018.
-
False Discovery Rate Control via Debiased Lasso
Authors:
Adel Javanmard,
Hamid Javadi
Abstract:
We consider the problem of variable selection in high-dimensional statistical models where the goal is to report a set of variables, out of many predictors $X_1, \dotsc, X_p$, that are relevant to a response of interest. For linear high-dimensional models, where the number of parameters exceeds the number of samples $(p>n)$, we propose a procedure for variable selection and prove that it controls the "directional" false discovery rate (FDR) below a pre-assigned significance level $q\in [0,1]$. We further analyze the statistical power of our framework and show that for designs with subgaussian rows and a common precision matrix $Ω\in\mathbb{R}^{p\times p}$, if the minimum nonzero parameter $θ_{\min}$ satisfies $$\sqrt{n} θ_{\min} - σ\sqrt{2(\max_{i\in [p]}Ω_{ii})\log\left(\frac{2p}{qs_0}\right)} \to \infty\,,$$ then this procedure achieves asymptotic power one. Our framework is built upon the debiasing approach and assumes the standard condition $s_0 = o(\sqrt{n}/(\log p)^2)$, where $s_0$ indicates the number of true positives among the $p$ features. Notably, this framework achieves exact directional FDR control without any assumption on the amplitude of unknown regression parameters, and does not require any knowledge of the distribution of covariates or the noise level. We test our method in synthetic and real data experiments to assess its performance and to corroborate our theoretical results.
Submitted 19 March, 2019; v1 submitted 12 March, 2018;
originally announced March 2018.
-
An Instability in Variational Inference for Topic Models
Authors:
Behrooz Ghorbani,
Hamid Javadi,
Andrea Montanari
Abstract:
Topic models are Bayesian models that are frequently used to capture the latent structure of certain corpora of documents or images. Each data element in such a corpus (for instance each item in a collection of scientific articles) is regarded as a convex combination of a small number of vectors corresponding to `topics' or `components'. The weights are assumed to have a Dirichlet prior distribution. The standard approach towards approximating the posterior is to use variational inference algorithms, and in particular a mean field approximation.
We show that this approach suffers from an instability that can produce misleading conclusions. Namely, for certain regimes of the model parameters, variational inference outputs a non-trivial decomposition into topics. However --for the same parameter values-- the data contain no actual information about the true decomposition, and hence the output of the algorithm is uncorrelated with the true topic decomposition. Among other consequences, the estimated posterior mean is significantly wrong, and estimated Bayesian credible regions do not achieve the nominal coverage. We discuss how this instability is remedied by more accurate mean field approximations.
Submitted 2 February, 2018;
originally announced February 2018.
-
Detecting Community Structure in Dynamic Social Networks Using the Concept of Leadership
Authors:
Saeed Haji Seyed Javadi,
Pedram Gharani,
Shahram Khadivi
Abstract:
Detecting community structure in social networks is a fundamental problem that enables us to identify groups of actors with similar interests. Extensive work has focused on finding communities in static networks; in reality, however, social networks are dynamic and evolve continuously. Ignoring the dynamic aspect of social networks allows us neither to capture the evolutionary behavior of the network nor to predict the future status of individuals. Aside from being dynamic, another significant characteristic of real-world social networks is the presence of leaders, i.e., nodes with high degree centrality that strongly attract other members and hence form a local community. In this paper, we devise an efficient method to incrementally detect communities in highly dynamic social networks using the intuitive idea of the importance and persistence of community leaders over time. Our proposed method is able to find new communities based on the previous structure of the network without recomputing them from scratch. This feature enables us to detect and track communities over time efficiently. Experimental results on synthetic and real-world social networks demonstrate that our method is both effective and efficient in discovering communities in dynamic social networks.
Submitted 6 November, 2017;
originally announced November 2017.
-
Porcupine Neural Networks: (Almost) All Local Optima are Global
Authors:
Soheil Feizi,
Hamid Javadi,
Jesse Zhang,
David Tse
Abstract:
Neural networks have been used prominently in several machine learning and statistics applications. In general, the underlying optimization of neural networks is non-convex which makes their performance analysis challenging. In this paper, we take a novel approach to this problem by asking whether one can constrain neural network weights to make its optimization landscape have good theoretical properties while at the same time, be a good approximation for the unconstrained one. For two-layer neural networks, we provide affirmative answers to these questions by introducing Porcupine Neural Networks (PNNs) whose weight vectors are constrained to lie over a finite set of lines. We show that most local optima of PNN optimizations are global while we have a characterization of regions where bad local optimizers may exist. Moreover, our theoretical and empirical results suggest that an unconstrained neural network can be approximated using a polynomially-large PNN.
Submitted 5 October, 2017;
originally announced October 2017.
-
Non-negative Matrix Factorization via Archetypal Analysis
Authors:
Hamid Javadi,
Andrea Montanari
Abstract:
Given a collection of data points, non-negative matrix factorization (NMF) proposes expressing them as convex combinations of a small set of `archetypes' with non-negative entries. This decomposition is unique only if the true archetypes are non-negative and sufficiently sparse (or the weights are sufficiently sparse), a regime that is captured by the separability condition and its generalizations.
In this paper, we study an approach to NMF that can be traced back to the work of Cutler and Breiman (1994) and does not require the data to be separable, while providing a generally unique decomposition. We optimize the trade-off between two objectives: we minimize the distance of the data points from the convex envelope of the archetypes (which can be interpreted as an empirical risk), while minimizing the distance of the archetypes from the convex envelope of the data (which can be interpreted as a data-dependent regularization). The archetypal analysis method of Cutler and Breiman (1994) is recovered as the limiting case in which the last term is given infinite weight.
We introduce a `uniqueness condition' on the data which is necessary for exactly recovering the archetypes from noiseless data. We prove that, under uniqueness (plus additional regularity conditions on the geometry of the archetypes), our estimator is robust. While our approach requires solving a non-convex optimization problem, we find that standard optimization methods succeed in finding good solutions both for real and synthetic data.
Submitted 8 May, 2017;
originally announced May 2017.
-
Preconditioning via Diagonal Scaling
Authors:
Reza Takapoui,
Hamid Javadi
Abstract:
Interior point methods solve small to medium sized problems to high accuracy in a reasonable amount of time. However, for larger problems as well as stochastic problems, one needs to use first-order methods such as stochastic gradient descent (SGD), the alternating direction method of multipliers (ADMM), and conjugate gradient (CG) in order to attain a modest accuracy in a reasonable number of iterations. In this report, we first discuss heuristics for diagonal scaling. Next, we motivate preconditioning by an example, and then we study preconditioning for a specific splitting form in ADMM called graph projection splitting. Finally, we examine the performance of our methods on some numerical examples.
Submitted 12 October, 2016;
originally announced October 2016.
-
A Statistical Model for Motifs Detection
Authors:
Hamid Javadi,
Andrea Montanari
Abstract:
We consider a statistical model for the problem of finding subgraphs with specified topology in an otherwise random graph. This task plays an important role in the analysis of social and biological networks. In these types of networks, small subgraphs with a specific structure have important functional roles, and they are referred to as `motifs.'
Within this model, one or multiple copies of a subgraph are added (`planted') in an Erdős-Rényi random graph with $n$ vertices and edge probability $q_0$. We ask whether the resulting graph can be distinguished reliably from a pure Erdős-Rényi random graph, and we present two types of results. First we investigate the question from a purely statistical perspective, and ask whether there is any test that can distinguish between the two graph models. We provide necessary and sufficient conditions that are essentially tight for small enough subgraphs.
Next we study two polynomial-time algorithms for solving the same problem: a spectral algorithm, and a semidefinite programming (SDP) relaxation. For the spectral algorithm, we establish sufficient conditions under which it distinguishes the two graph models with high probability. Under the same conditions the spectral algorithm indeed identifies the hidden subgraph.
The spectral algorithm is substantially sub-optimal with respect to the optimal test. We show that a similar gap is present for the more sophisticated SDP approach.
Submitted 20 October, 2017; v1 submitted 16 November, 2015;
originally announced November 2015.
-
On a generalization of McCoy Rings
Authors:
Mohammad Vahdani Mehrabadi,
Shervin Sahebi,
Hamid H. S. Javadi
Abstract:
We introduce Central McCoy rings, which are a generalization of McCoy rings, and investigate their properties. For a ring R, we prove that R is right Central McCoy if and only if the polynomial ring R[x] is right Central McCoy. Also, we give some examples to show that if R is right Central McCoy, then Mn(R) and Tn(R) are not necessarily right Central McCoy, but Dn(R) and Vn(R) are right Central McCoy, where Dn(R) and Vn(R) are the subrings of the triangular matrices with constant main diagonal and constant main diagonals, respectively.
Submitted 13 October, 2014;
originally announced October 2014.
-
On a generalization of NC-McCoy Rings
Authors:
Mohammad Vahdani Mehrabadi,
Shervin Sahebi,
Hamid H. S. Javadi
Abstract:
In the present paper we concentrate on a natural generalization of NC-McCoy rings, called J-McCoy rings, and investigate their properties. We prove that local rings are J-McCoy. Also, for an abelian ring R, we show that R is J-McCoy if and only if eR is J-McCoy, where e is an idempotent element of R. Moreover, we give an example to show that the J-McCoy property does not pass to Mn(R), but S(R; n), A(R; n), B(R; n) and T(R; n) are J-McCoy.
Submitted 12 October, 2014;
originally announced October 2014.
-
On A Generalization of Weak Armendariz Rings
Authors:
Mahboubeh Sanaei,
Shervin Sahebi,
Hamid H. S. Javadi
Abstract:
We introduce the notion of J-Armendariz rings, which are a generalization of weak Armendariz rings, and investigate their properties. We show that any local ring is J-Armendariz, and then find a local ring that is not weak Armendariz.
Submitted 6 October, 2014;
originally announced October 2014.
-
On Central Skew Armendariz rings
Authors:
Mahboubeh Sanaei,
Shervin Sahebi,
Hamid H. S. Javadi
Abstract:
For a ring endomorphism, we introduce central skew Armendariz rings, which are a generalization of skew Armendariz rings and central Armendariz rings, and investigate their properties.
Submitted 29 September, 2014;
originally announced September 2014.
-
Water vapor toward starless cores: the Herschel view
Authors:
P. Caselli,
E. Keto,
L. Pagani,
Y. Aikawa,
U. A. Yildiz,
F. F. S. van der Tak,
M. Tafalla,
E. A. Bergin,
B. Nisini,
C. Codella,
E. F. van Dishoeck,
R. Bachiller,
A. Baudry,
M. Benedettini,
A. O. Benz,
P. Bjerkeli,
G. A. Blake,
S. Bontemps,
J. Braine,
S. Bruderer,
J. Cernicharo,
F. Daniel,
A. M. di Giorgio,
C. Dominik,
S. D. Doty
, et al. (41 additional authors not shown)
Abstract:
SWAS and Odin provided stringent upper limits on the gas phase water abundance of dark clouds (x(H2O) < 7x10^-9). We investigate the chemistry of water vapor in starless cores beyond the previous upper limits using the highly improved angular resolution and sensitivity of Herschel and measure the abundance of water vapor during evolutionary stages just preceding star formation. High spectral resolution observations of the fundamental ortho water (o-H2O) transition (557 GHz) were carried out with Herschel HIFI toward two starless cores: B68, a Bok globule, and L1544, a prestellar core embedded in the Taurus molecular cloud complex. The rms in the brightness temperature measured for the B68 and L1544 spectra is 2.0 and 2.2 mK, respectively, in a velocity bin of 0.59 km s^-1. The continuum level is 3.5+/-0.2 mK in B68 and 11.4+/-0.4 mK in L1544. No significant feature is detected in B68 and the 3 sigma upper limit is consistent with a column density of o-H2O N(o-H2O) < 2.5x10^13 cm^-2, or a fractional abundance x(o-H2O) < 1.3x10^-9, more than an order of magnitude lower than the SWAS upper limit on this source. The L1544 spectrum shows an absorption feature at a 5 sigma level from which we obtain the first value of the o-H2O column density ever measured in dark clouds: N(o-H2O) = (8+/-4)x10^12 cm^-2. The corresponding fractional abundance is x(o-H2O) ~ 5x10^-9 at radii > 7000 AU and ~2x10^-10 toward the center. The radiative transfer analysis shows that this is consistent with a x(o-H2O) profile peaking at ~10^-8, 0.1 pc away from the core center, where both freeze-out and photodissociation are negligible. Herschel has provided the first measurement of water vapor in dark regions. Prestellar cores such as L1544 (with their high central densities, strong continuum, and large envelopes) are very promising tools to finally shed light on the solid/vapor balance of water in molecular clouds.
Submitted 7 July, 2010;
originally announced July 2010.
-
Herschel-PACS spectroscopy of the intermediate mass protostar NGC7129 FIRS 2
Authors:
M. Fich,
D. Johnstone,
T. A. van Kempen,
C. McCoey,
A. Fuente,
P. Caselli,
L. E. Kristensen,
R. Plume,
J. Cernicharo,
G. J. Herczeg,
E. F. van Dishoeck,
S. Wampfler,
P. Gaufre,
J. J. Gill,
H. Javadi,
M. Justen,
W. Laauwen,
W. Luinge,
V. Ossenkopf,
J. Pearson,
R. Bachiller,
A. Baudry,
M. Benedettini,
E. Bergin,
A. O. Benz
, et al. (36 additional authors not shown)
Abstract:
Aims: We present preliminary results of the first Herschel spectroscopic observations of NGC7129 FIRS2, an intermediate mass star-forming region. We attempt to interpret the observations in the framework of an in-falling spherical envelope. Methods: The PACS instrument was used in line spectroscopy mode (R=1000-5000) with 15 spectral bands between 63 and 185 microns. This provided good detections of 26 spectral lines seen in emission, including lines of H2O, CO, OH, O I, and C II. Results: Most of the detected lines, particularly those of H2O and CO, are substantially stronger than predicted by the spherical envelope models, typically by several orders of magnitude. In this paper we focus on what can be learned from the detected CO emission lines. Conclusions: It is unlikely that the much stronger than expected line emission arises in the (spherical) envelope of the YSO. The region hot enough to produce such high excitation lines within such an envelope is too small to produce the amount of emission observed. Virtually all of this high excitation emission must arise in structures such as along the walls of the outflow cavity, with the emission produced by a combination of UV photon heating and/or non-dissociative shocks.
Submitted 1 June, 2010;
originally announced June 2010.