-
Faster Spectral Density Estimation and Sparsification in the Nuclear Norm
Authors:
Yujia Jin,
Ishani Karmarkar,
Christopher Musco,
Aaron Sidford,
Apoorv Vikram Singh
Abstract:
We consider the problem of estimating the spectral density of the normalized adjacency matrix of an $n$-node undirected graph. We provide a randomized algorithm that, with $O(nε^{-2})$ queries to a degree and neighbor oracle and in $O(nε^{-3})$ time, estimates the spectrum up to $ε$ accuracy in the Wasserstein-1 metric. This improves on previous state-of-the-art methods, including an $O(nε^{-7})$ time algorithm from [Braverman et al., STOC 2022] and, for sufficiently small $ε$, a $2^{O(ε^{-1})}$ time method from [Cohen-Steiner et al., KDD 2018]. To achieve this result, we introduce a new notion of graph sparsification, which we call nuclear sparsification. We provide an $O(nε^{-2})$-query and $O(nε^{-2})$-time algorithm for computing $O(nε^{-2})$-sparse nuclear sparsifiers. We show that this bound is optimal in both its sparsity and query complexity, and we separate our results from the related notion of additive spectral sparsification. Of independent interest, we show that our sparsification method also yields the first deterministic algorithm for spectral density estimation that scales linearly with $n$ (sublinear in the representation size of the graph).
Submitted 11 June, 2024;
originally announced June 2024.
-
Mean Estimation with User-Level Privacy for Spatio-Temporal IoT Datasets
Authors:
V. Arvind Rameshwar,
Anshoo Tandon,
Prajjwal Gupta,
Aditya Vikram Singh,
Novoneel Chakraborty,
Abhay Sharma
Abstract:
This paper considers the problem of the private release of sample means of speed values from traffic datasets. Our key contribution is the development of user-level differentially private algorithms that incorporate carefully chosen parameter values to ensure low estimation errors on real-world datasets, while ensuring privacy. We test our algorithms on ITMS (Intelligent Traffic Management System) data from an Indian city, where the speeds of different buses are drawn in a potentially non-i.i.d. manner from an unknown distribution, and where the number of speed samples contributed by different buses is potentially different. We then apply our algorithms to large synthetic datasets, generated based on the ITMS data. Here, we provide theoretical justification for the observed performance trends, and also provide recommendations for the choices of algorithm subroutines that result in low estimation errors. Finally, we characterize the best performance of pseudo-user creation-based algorithms on worst-case datasets via a minimax approach; this then gives rise to a novel procedure for the creation of pseudo-users, which optimizes the worst-case total estimation error. The algorithms discussed in the paper are readily applicable to general spatio-temporal IoT datasets for releasing a differentially private mean of a desired value.
Submitted 25 April, 2024; v1 submitted 29 January, 2024;
originally announced January 2024.
-
Moments, Random Walks, and Limits for Spectrum Approximation
Authors:
Yujia Jin,
Christopher Musco,
Aaron Sidford,
Apoorv Vikram Singh
Abstract:
We study lower bounds for the problem of approximating a one-dimensional distribution given (noisy) measurements of its moments. We show that there are distributions on $[-1,1]$ that cannot be approximated to accuracy $ε$ in Wasserstein-1 distance even if we know all of their moments to multiplicative accuracy $(1\pm2^{-Ω(1/ε)})$; this result matches an upper bound of Kong and Valiant [Annals of Statistics, 2017]. To obtain our result, we provide a hard instance involving distributions induced by the eigenvalue spectra of carefully constructed graph adjacency matrices. Efficiently approximating such spectra in Wasserstein-1 distance is a well-studied algorithmic problem, and a recent result of Cohen-Steiner et al. [KDD 2018] gives a method based on accurately approximating spectral moments using $2^{O(1/ε)}$ random walks initiated at uniformly random nodes in the graph.
As a strengthening of our main result, we show that improving the dependence on $1/ε$ in this result would require a new algorithmic approach. Specifically, no algorithm can compute an $ε$-accurate approximation to the spectrum of a normalized graph adjacency matrix with constant probability, even when given the transcript of $2^{Ω(1/ε)}$ random walks of length $2^{Ω(1/ε)}$ started at random nodes.
Submitted 2 July, 2023;
originally announced July 2023.
-
Continual Mean Estimation Under User-Level Privacy
Authors:
Anand Jerry George,
Lekshmi Ramesh,
Aditya Vikram Singh,
Himanshu Tyagi
Abstract:
We consider the problem of continually releasing an estimate of the population mean of a stream of samples that is user-level differentially private (DP). At each time instant, a user contributes a sample, and the users can arrive in arbitrary order. Until now these requirements of continual release and user-level privacy were considered in isolation. But, in practice, both these requirements come together as the users often contribute data repeatedly and multiple queries are made. We provide an algorithm that outputs a mean estimate at every time instant $t$ such that the overall release is user-level $\varepsilon$-DP and has the following error guarantee: Denoting by $M_t$ the maximum number of samples contributed by a user, as long as $\tilde{Ω}(1/\varepsilon)$ users have $M_t/2$ samples each, the error at time $t$ is $\tilde{O}(1/\sqrt{t}+\sqrt{M_t}/(t\varepsilon))$. This is a universal error guarantee which is valid for all arrival patterns of the users. Furthermore, it (almost) matches the existing lower bounds for the single-release setting at all time instants when users have contributed an equal number of samples.
Submitted 19 December, 2022;
originally announced December 2022.
-
Automatic Segmentation of the Great Arteries for Computational Hemodynamic Assessment
Authors:
Javier Montalt-Tordera,
Endrit Pajaziti,
Rod Jones,
Emilie Sauvage,
Rajesh Puranik,
Aakansha Ajay Vir Singh,
Claudio Capelli,
Jennifer Steeden,
Silvia Schievano,
Vivek Muthurangu
Abstract:
Background: Computational fluid dynamics (CFD) is increasingly used to assess blood flow conditions in patients with congenital heart disease (CHD). This requires patient-specific anatomy, usually obtained from segmented 3D cardiovascular magnetic resonance (CMR) images. However, segmentation is time-consuming and needs expert input. This study aims to develop and validate a machine learning (ML) method for segmentation of the aorta and pulmonary arteries (PAs) for CFD studies.
Methods: 90 CHD patients were retrospectively selected for this study. 3D CMR images were manually segmented to obtain ground-truth (GT) background, aorta and PA labels. These were used to train and optimize a U-Net model. Segmentation performance was primarily evaluated using Dice score. CFD simulations were set up from GT and ML segmentations using a semi-automatic meshing and simulation pipeline. Pressure and velocity fields were computed, and a mean average percentage error (MAPE) was calculated for each vessel pair. A secondary observer (SO) segmented the test dataset to assess inter-observer variability. Friedman tests were used to compare segmentation metrics and flow field errors.
Results: The model's Dice score (ML vs GT) was 0.945 for the aorta and 0.885 for the PAs. Differences with the inter-observer Dice score (SO vs GT) and ML vs SO Dice scores were not statistically significant for either aorta or PAs. The ML vs GT MAPEs for pressure and velocity in the aorta were 10.1% and 4.1% respectively, and for the PAs 14.6% and 6.3%, respectively. Inter-observer (SO vs GT) and ML vs SO pressure and velocity MAPEs were of a similar magnitude to ML vs GT.
Conclusions: The proposed method can successfully segment the great vessels for CFD, with errors similar to inter-observer variability. This reduces the time and effort needed for CFD analysis, making it more attractive for routine clinical use.
Submitted 13 September, 2022;
originally announced September 2022.
-
COGMEN: COntextualized GNN based Multimodal Emotion recognitioN
Authors:
Abhinav Joshi,
Ashwani Bhat,
Ayush Jain,
Atin Vikram Singh,
Ashutosh Modi
Abstract:
Emotions are an inherent part of human interactions, and consequently, it is imperative to develop AI systems that understand and recognize human emotions. During a conversation involving various people, a person's emotions are influenced by the other speaker's utterances and their own emotional state over the utterances. In this paper, we propose the COntextualized Graph Neural Network based Multimodal Emotion recognitioN (COGMEN) system, which leverages local information (i.e., inter/intra dependency between speakers) and global information (context). The proposed model uses a Graph Neural Network (GNN) based architecture to model the complex dependencies (local and global information) in a conversation. Our model gives state-of-the-art (SOTA) results on the IEMOCAP and MOSEI datasets, and detailed ablation experiments show the importance of modeling information at both levels.
Submitted 5 May, 2022;
originally announced May 2022.
-
Explicitising The Implicit Intrepretability of Deep Neural Networks Via Duality
Authors:
Chandrashekar Lakshminarayanan,
Amit Vikram Singh,
Arun Rajkumar
Abstract:
Recent work by Lakshminarayanan and Singh [2020] provided a dual view for fully connected deep neural networks (DNNs) with rectified linear units (ReLU). It was shown that (i) the information in the gates is analytically characterised by a kernel called the neural path kernel (NPK) and (ii) most critical information is learnt in the gates, in that, given the learnt gates, the weights can be retrained from scratch without significant loss in performance. Using the dual view, in this paper, we rethink the conventional interpretations of DNNs, thereby explicitising the implicit interpretability of DNNs. Towards this, we first show new theoretical properties, namely rotational invariance and ensemble structure of the NPK in the presence of convolutional layers and skip connections, respectively. Our theory leads to two surprising empirical results that challenge conventional wisdom: (i) the weights can be trained even with a constant 1 input, (ii) the gating masks can be shuffled, without any significant loss in performance. These results motivate a novel class of networks which we call deep linearly gated networks (DLGNs). DLGNs, using the phenomenon of dual lifting, pave the way to a more direct and simpler interpretation of DNNs as opposed to conventional interpretations. We show via extensive experiments on CIFAR-10 and CIFAR-100 that these DLGNs lead to a much better interpretability-accuracy tradeoff.
Submitted 28 February, 2022;
originally announced March 2022.
-
Disentangling deep neural networks with rectified linear units using duality
Authors:
Chandrashekar Lakshminarayanan,
Amit Vikram Singh
Abstract:
Despite their success, deep neural networks (DNNs) are still largely considered as black boxes. The main issue is that the linear and non-linear operations are entangled in every layer, making it hard to interpret the hidden layer outputs. In this paper, we look at DNNs with rectified linear units (ReLUs), and focus on the gating property (`on/off' states) of the ReLUs. We extend the recently developed dual view in which the computation is broken path-wise to show that learning in the gates is more crucial, and learning the weights given the gates is characterised analytically via the so-called neural path kernel (NPK), which depends on inputs and gates. In this paper, we present novel results to show that convolution with global pooling and skip connection provide, respectively, rotational invariance and ensemble structure to the NPK. To address `black box'-ness, we propose a novel interpretable counterpart of DNNs with ReLUs, namely deep linearly gated networks (DLGN): the pre-activations to the gates are generated by a deep linear network, and the gates are then applied as external masks to learn the weights in a different network. The DLGN is not an alternative architecture per se, but a disentanglement and an interpretable re-arrangement of the computations in a DNN with ReLUs. The DLGN disentangles the computations into two `mathematically' interpretable linearities: (i) the `primal' linearity between the input and the pre-activations in the gating network and (ii) the `dual' linearity in the path space in the weights network characterised by the NPK. We compare the performance of DNN, DGN and DLGN on CIFAR-10 and CIFAR-100 to show that the DLGN recovers more than $83.5\%$ of the performance of state-of-the-art DNNs. This brings us to an interesting question: `Is DLGN a universal spectral approximator?'
Submitted 6 October, 2021;
originally announced October 2021.
-
DynG2G: An Efficient Stochastic Graph Embedding Method for Temporal Graphs
Authors:
Mengjia Xu,
Apoorva Vikram Singh,
George Em Karniadakis
Abstract:
Dynamic graph embedding has gained great attention recently due to its capability of learning low-dimensional graph representations for complex temporal graphs with high accuracy. However, recent advances mostly focus on learning node embeddings as deterministic "vectors" for static graphs, disregarding the key graph temporal dynamics and the evolving uncertainties associated with node embedding in the latent space. In this work, we propose an efficient stochastic dynamic graph embedding method (DynG2G) that applies an inductive feed-forward encoder trained with node triplet-based contrastive loss. Every node per timestamp is encoded as a time-dependent probabilistic multivariate Gaussian distribution in the latent space, hence we can quantify the node embedding uncertainty on-the-fly. We adopted eight different benchmarks that represent diversity in size (from 96 nodes to 87,626 and from 13,398 edges to 4,870,863) and diversity in dynamics. We demonstrate via extensive experiments on these eight dynamic graph benchmarks that DynG2G achieves new state-of-the-art performance in capturing the underlying temporal node embeddings. We also demonstrate that DynG2G can predict the evolving node embedding uncertainty, which plays a crucial role in quantifying the intrinsic dimensionality of the dynamical system over time. We obtain a universal relation of the optimal embedding dimension, $L_o$, versus the effective dimensionality of uncertainty, $D_u$, and we infer that $L_o=D_u$ for all cases. This implies that the uncertainty quantification approach we employ in the DynG2G correctly captures the intrinsic dimensionality of the dynamics of such evolving graphs despite the diverse nature and composition of the graphs at each timestamp. Moreover, this $L_o - D_u$ correlation provides a clear path to select adaptively the optimum embedding size at each timestamp by setting $L \ge D_u$.
Submitted 27 April, 2022; v1 submitted 27 September, 2021;
originally announced September 2021.
-
Optimal Rates for Nonparametric Density Estimation under Communication Constraints
Authors:
Jayadev Acharya,
Clément L. Canonne,
Aditya Vikram Singh,
Himanshu Tyagi
Abstract:
We consider density estimation for Besov spaces when each sample is quantized to only a limited number of bits. We provide a noninteractive adaptive estimator that exploits the sparsity of wavelet bases, along with a simulate-and-infer technique from parametric estimation under communication constraints. We show that our estimator is nearly rate-optimal by deriving minimax lower bounds that hold even when interactive protocols are allowed. Interestingly, while our wavelet-based estimator is almost rate-optimal for Sobolev spaces as well, it is unclear whether the standard Fourier basis, which arises naturally for those spaces, can be used to achieve the same performance.
Submitted 21 July, 2021;
originally announced July 2021.
-
Regularized spectral methods for clustering signed networks
Authors:
Mihai Cucuringu,
Apoorv Vikram Singh,
Déborah Sulem,
Hemant Tyagi
Abstract:
We study the problem of $k$-way clustering in signed graphs. Considerable attention in recent years has been devoted to analyzing and modeling signed graphs, where the affinity measure between nodes takes either positive or negative values. Recently, Cucuringu et al. [CDGT 2019] proposed a spectral method, namely SPONGE (Signed Positive over Negative Generalized Eigenproblem), which casts the clustering task as a generalized eigenvalue problem optimizing a suitably defined objective function. This approach is motivated by social balance theory, where the clustering task aims to decompose a given network into disjoint groups, such that individuals within the same group are connected by as many positive edges as possible, while individuals from different groups are mainly connected by negative edges. Through extensive numerical simulations, SPONGE was shown to achieve state-of-the-art empirical performance. On the theoretical front, [CDGT 2019] analyzed SPONGE and the popular Signed Laplacian method under the setting of a Signed Stochastic Block Model (SSBM), for $k=2$ equal-sized clusters, in the regime where the graph is moderately dense.
In this work, we build on the results in [CDGT 2019] on two fronts for the normalized versions of SPONGE and the Signed Laplacian. Firstly, for both algorithms, we extend the theoretical analysis in [CDGT 2019] to the general setting of $k \geq 2$ unequal-sized clusters in the moderately dense regime. Secondly, we introduce regularized versions of both methods to handle sparse graphs -- a regime where standard spectral methods underperform -- and provide theoretical guarantees under the same SSBM model. To the best of our knowledge, regularized spectral methods have so far not been considered in the setting of clustering signed graphs. We complement our theoretical results with an extensive set of numerical experiments on synthetic data.
Submitted 3 November, 2020;
originally announced November 2020.
-
Seq2Seq and Joint Learning Based Unix Command Line Prediction System
Authors:
Thoudam Doren Singh,
Abdullah Faiz Ur Rahman Khilji,
Divyansha,
Apoorva Vikram Singh,
Surmila Thokchom,
Sivaji Bandyopadhyay
Abstract:
Despite being an open-source operating system pioneered in the early 90s, UNIX based platforms have not been able to garner an overwhelming reception from amateur end users. One of the rationales for the unpopularity of UNIX based systems is their steep learning curve, due to the extensive use of a command line interface instead of the usual interactive graphical user interface. In past years, the majority of insights used to explore the concern are eminently centered around the notion of utilizing the chronic log history of the user to predict the successive command. The approaches directed at anatomization of this notion are predominantly in accordance with Probabilistic inference models. The techniques employed in the past, however, have not been competent enough to address the predicament as legitimately as anticipated. Instead of deploying the usual mechanism of recommendation systems, we have employed a simple yet novel approach of a Seq2seq model by leveraging continuous representations of a self-curated exhaustive Knowledge Base (KB) to enhance the embedding employed in the model. This work describes an assistive, adaptive and dynamic way of enhancing UNIX command line prediction systems. Experimental methods state that our model has achieved accuracy surpassing a mixture of other techniques and adaptive command line interface mechanisms as acclaimed in the past.
Submitted 20 June, 2020;
originally announced June 2020.
-
Neural Path Features and Neural Path Kernel : Understanding the role of gates in deep learning
Authors:
Chandrashekar Lakshminarayanan,
Amit Vikram Singh
Abstract:
Rectified linear unit (ReLU) activations can also be thought of as 'gates', which either pass or stop their pre-activation input when they are 'on' (when the pre-activation input is positive) or 'off' (when the pre-activation input is negative), respectively. A deep neural network (DNN) with ReLU activations has many gates, and the on/off status of each gate changes across input examples as well as network weights. For a given input example, only a subset of gates are 'active', i.e., on, and the sub-network of weights connected to these active gates is responsible for producing the output. At randomised initialisation, the active sub-network corresponding to a given input example is random. During training, as the weights are learnt, the active sub-networks are also learnt, and potentially hold very valuable information. In this paper, we analytically characterise the role of active sub-networks in deep learning. To this end, we encode the on/off state of the gates of a given input in a novel 'neural path feature' (NPF), and the weights of the DNN are encoded in a novel 'neural path value' (NPV). Further, we show that the output of the network is indeed the inner product of NPF and NPV. The main result of the paper shows that the 'neural path kernel' associated with the NPF is a fundamental quantity that characterises the information stored in the gates of a DNN. We show via experiments (on MNIST and CIFAR-10) that in standard DNNs with ReLU activations NPFs are learnt during training and such learning is key for generalisation. Furthermore, NPFs and NPVs can be learnt in two separate networks and such learning also generalises well in experiments.
Submitted 12 June, 2021; v1 submitted 11 June, 2020;
originally announced June 2020.
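The inner-product claim in the abstract above can be checked on a toy case. The following is a minimal sketch, not the authors' code: it assumes a hypothetical two-layer ReLU network with a scalar output, treats each path as (input i, hidden unit j, output), and uses illustrative names (`relu_forward`, `path_features_and_values`) and made-up weights.

```python
def relu_forward(x, W1, w2):
    """Ordinary forward pass: sum_j w2[j] * max(0, W1[j] . x)."""
    out = 0.0
    for row, v in zip(W1, w2):
        pre = sum(w * xi for w, xi in zip(row, x))
        out += v * max(0.0, pre)
    return out


def path_features_and_values(x, W1, w2):
    """Enumerate every path input i -> hidden j -> output.

    NPF entry (assumed form): input value times the on/off gate on the path.
    NPV entry (assumed form): product of the weights along the path.
    """
    npf, npv = [], []
    for row, v in zip(W1, w2):
        pre = sum(w * xi for w, xi in zip(row, x))
        gate = 1.0 if pre > 0 else 0.0  # ReLU gate for hidden unit j
        for w, xi in zip(row, x):
            npf.append(xi * gate)
            npv.append(w * v)
    return npf, npv


# Illustrative input and weights (assumptions, chosen for easy arithmetic).
x = [1.0, -2.0]
W1 = [[0.5, 0.25], [-1.0, 0.5], [2.0, 0.5]]
w2 = [1.0, -3.0, 0.5]

npf, npv = path_features_and_values(x, W1, w2)
inner = sum(f * v for f, v in zip(npf, npv))
```

In this sketch only the third hidden unit is active, so `inner` and `relu_forward(x, W1, w2)` agree; the gates alone decide which path weights contribute.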
-
Enhancement in Thermally Generated Spin Voltage at Pd/NiFe$_2$O$_4$ Interfaces by the Growth on Lattice-Matched Substrates
Authors:
A. Rastogi,
Z. Li,
A. V. Singh,
S. Regmi,
T. Peters,
P. Bougiatioti,
D. Carsten né Meier,
J. B. Mohammadi,
B. Khodadadi,
T. Mewes,
R. Mishra,
J. Gazquez,
A. Y. Borisevich,
Z. Galazka,
R. Uecker,
G. Reiss,
T. Kuschel,
A. Gupta
Abstract:
Efficient spin injection from epitaxial ferrimagnetic NiFe$_2$O$_4$ thin films into a Pd layer is demonstrated via spin Seebeck effect measurements in the longitudinal geometry. The NiFe$_2$O$_4$ films (60 nm to 1 $μ$m) are grown by pulsed laser deposition on isostructural spinel MgAl$_2$O$_4$, MgGa$_2$O$_4$, and CoGa$_2$O$_4$ substrates with lattice mismatch varying between 3.2% and 0.2%. For the thinner films ($\leq$ 330 nm), an increase in the spin Seebeck voltage is observed with decreasing lattice mismatch, which correlates well with a decrease in the Gilbert damping parameter as determined from ferromagnetic resonance measurements. High resolution transmission electron microscopy studies indicate substantial decrease of antiphase boundary and interface defects that cause strain-relaxation, i.e., misfit dislocations, in the films with decreasing lattice mismatch. This highlights the importance of reducing structural defects in spinel ferrites for efficient spin injection. It is further shown that angle-dependent spin Seebeck effect measurements provide a qualitative method to probe for in-plane magnetic anisotropies present in the films.
Submitted 1 June, 2020;
originally announced June 2020.
-
Deep Gated Networks: A framework to understand training and generalisation in deep learning
Authors:
Chandrashekar Lakshminarayanan,
Amit Vikram Singh
Abstract:
Understanding the role of (stochastic) gradient descent (SGD) in the training and generalisation of deep neural networks (DNNs) with ReLU activation has been the object of study in the recent past. In this paper, we make use of deep gated networks (DGNs) as a framework to obtain insights about DNNs with ReLU activation. In DGNs, a single neuronal unit has two components, namely the pre-activation input (equal to the inner product of the weights of the layer and the previous layer outputs) and a gating value which belongs to $[0,1]$, and the output of the neuronal unit is equal to the multiplication of the pre-activation input and the gating value. The standard DNN with ReLU activation is a special case of the DGNs, wherein the gating value is $1$ or $0$ depending on whether the pre-activation input is positive or negative. We theoretically analyse and experiment with several variants of DGNs, each variant suited to understand a particular aspect of either training or generalisation in DNNs with ReLU activation. Our theory throws light on two questions, namely (i) why increasing depth till a point helps in training and (ii) why increasing depth beyond a point hurts training. We also present experimental evidence to show that gate adaptation, i.e., the change of gating value through the course of training, is key for generalisation.
Submitted 2 March, 2020; v1 submitted 10 February, 2020;
originally announced February 2020.
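The gated unit described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names (`gated_unit`, `relu_gate`, `relu_unit`) and the sample numbers are assumptions.

```python
def pre_activation(weights, prev_outputs):
    """Inner product of the layer weights and the previous layer outputs."""
    return sum(w * o for w, o in zip(weights, prev_outputs))


def gated_unit(weights, prev_outputs, gate):
    """DGN-style unit: pre-activation multiplied by a gating value in [0, 1]."""
    if not 0.0 <= gate <= 1.0:
        raise ValueError("gating value must lie in [0, 1]")
    return pre_activation(weights, prev_outputs) * gate


def relu_gate(pre):
    """ReLU special case: gate is 1 for positive pre-activation, else 0."""
    return 1.0 if pre > 0 else 0.0


def relu_unit(weights, prev_outputs):
    """ReLU unit written as a gated unit whose gate depends on its own input."""
    pre = pre_activation(weights, prev_outputs)
    return gated_unit(weights, prev_outputs, relu_gate(pre))


# Illustrative values (assumptions).
on = relu_unit([2.0, 0.5], [0.5, -1.0])    # pre-activation is 0.5, gate is 1
off = relu_unit([0.5, -1.0], [2.0, 3.0])   # pre-activation is -2.0, gate is 0
soft = gated_unit([2.0, 0.5], [0.5, -1.0], 0.5)  # gate supplied externally
```

Passing the gate in as an argument is what lets a gate come from somewhere other than the unit's own pre-activation, which is the degree of freedom the DGN framework relies on.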
-
Convergence Analysis of Nonconvex ADMM for Rigid Registration
Authors:
Aditya V. Singh,
Kunal N. Chaudhury
Abstract:
We consider the problem of rigid registration, where we wish to jointly register multiple point sets via rigid transforms. This arises in applications such as sensor network localization, multiview registration, and protein structure determination. The least-squares estimator for this problem can be reduced to a rank-constrained semidefinite program (REG-SDP). It was recently shown that by formally applying the alternating direction method of multipliers (ADMM), we can derive an iterative solver (REG-ADMM) for REG-SDP, wherein each subproblem admits a simple closed-form solution. The empirical success of REG-ADMM has been demonstrated for multiview registration. However, its convergence does not follow from the existing literature on nonconvex ADMM. In this work, we study the convergence of REG-ADMM and our main findings are as follows. We prove that any fixed point of REG-ADMM is a stationary (KKT) point of REG-SDP. Moreover, for clean measurements, we give an explicit formula for the ADMM parameter $ρ$, for which REG-ADMM is guaranteed to converge to the global optimum (with arbitrary initialization). If the noise is low, we can still show that the iterates converge to the global optimum, provided they are initialized sufficiently close to the optimum. On the other hand, if the noise is high, we explain why REG-ADMM becomes unstable if $ρ$ is less than some threshold, irrespective of the initialization. We present simulation results to support our theoretical predictions. The novelty of our analysis lies in the fact that we exploit the notion of tightness of convex relaxation to arrive at our convergence results.
Submitted 17 July, 2019;
originally announced July 2019.
-
On Uniquely Registrable Networks
Authors:
Aditya V. Singh,
Kunal N. Chaudhury
Abstract:
Consider a network with $N$ nodes in $d$-dimensional Euclidean space, and $M$ subsets of these nodes $P_1,\cdots,P_M$. Assume that the nodes in a given $P_i$ are observed in a local coordinate system. The registration problem is to compute the coordinates of the $N$ nodes in a global coordinate system, given the information about $P_1,\cdots,P_M$ and the corresponding local coordinates. The network is said to be uniquely registrable if the global coordinates can be computed uniquely (modulo Euclidean transforms). We formulate a necessary and sufficient condition for a network to be uniquely registrable in terms of rigidity of the body graph of the network. A particularly simple characterization of unique registrability is obtained for planar networks. Further, we show that $k$-vertex-connectivity of the body graph is equivalent to quasi $k$-connectivity of the bipartite correspondence graph of the network. Along with results from rigidity theory, this helps us resolve a recent conjecture due to Sanyal et al. (IEEE TSP, 2017) that quasi $3$-connectivity of the correspondence graph is both necessary and sufficient for unique registrability in two dimensions. We present counterexamples demonstrating that while quasi $(d+1)$-connectivity is necessary for unique registrability in any dimension, it fails to be sufficient in three and higher dimensions.
Submitted 24 June, 2019;
originally announced June 2019.
-
Vectorial observation of the spin Seebeck effect in epitaxial NiFe$_2$O$_4$ thin films with various magnetic anisotropy contributions
Authors:
Zhong Li,
Jan Krieft,
Amit Vikram Singh,
Sudhir Regmi,
Ankur Rastogi,
Abhishek Srivastava,
Zbigniew Galazka,
Tim Mewes,
Arunava Gupta,
Timo Kuschel
Abstract:
We have developed a vectorial type of measurement for the spin Seebeck effect (SSE) in epitaxial NiFe$_2$O$_4$ thin films which have been grown by pulsed laser deposition on MgGa$_2$O$_4$ (MGO) with (001) and (011) orientation as well as CoGa$_2$O$_4$ (011) (CGO), thus varying the lattice mismatch and crystal orientation. We confirm that a large lattice mismatch leads to strain anisotropy in addition to the magnetocrystalline anisotropy in the thin films using vibrating sample magnetometry and ferromagnetic resonance measurements. Moreover, we show that the existence of a magnetic strain anisotropy in NiFe$_2$O$_4$ thin films significantly impacts the shape and magnitude of the magnetic-field-dependent SSE voltage loops. We further demonstrate that bidirectional field-dependent SSE voltage curves can be utilized to reveal the complete magnetization reversal process, which establishes a vectorial magnetometry technique based on a spin caloric effect.
Submitted 12 May, 2019; v1 submitted 14 February, 2019;
originally announced February 2019.
-
Current direction anisotropy of the spin Hall magnetoresistance in nickel ferrite thin films with bulk-like magnetic properties
Authors:
Matthias Althammer,
Amit Vikam Singh,
Tobias Wimmer,
Zbigniew Galazka,
Hans Huebl,
Matthias Opel,
Rudolf Gross,
Arunava Gupta
Abstract:
We utilize spin Hall magnetoresistance (SMR) measurements to experimentally investigate the pure spin current transport and magnetic properties of nickel ferrite (NiFe$_2$O$_4$, NFO)/normal metal (NM) thin film heterostructures. We use (001)-oriented NFO thin films grown on lattice-matched magnesium gallate substrates by pulsed laser deposition, which significantly improves the magnetic and structural properties of the ferrimagnetic insulator. The NM in our experiments is either Pt or Ta. A comparison of the obtained SMR magnitude for charge currents applied in the [100]- and [110]-direction of NFO yields a change of 50% for Pt at room temperature. We also investigate the temperature dependence of this current direction anisotropy and find that it is qualitatively different for the conductivity and the SMR magnitude. From our results we conclude that the observed current direction anisotropy may originate from an anisotropy of the spin mixing conductance or of the spin Hall effect in these Pt and Ta layers, and/or additional spin-galvanic contributions from the NFO/NM interface.
Submitted 12 November, 2018;
originally announced November 2018.
-
Enhanced magnon spin transport in NiFe$_2$O$_4$ thin films on a lattice-matched substrate
Authors:
J. Shan,
A. V. Singh,
L. Liang,
L. Cornelissen,
A. Gupta,
B. J. van Wees,
T. Kuschel
Abstract:
We investigate magnon spin transport in epitaxial nickel ferrite (NiFe$_2$O$_4$, NFO) films grown on magnesium gallate spinel (MgGa$_2$O$_4$, MGO) substrates, which have a lattice mismatch with NFO as small as 0.78%, resulting in the reduction of antiphase boundary defects and thus in improved magnetic properties in the NFO films. In the nonlocal transport experiments, enhanced signals are observed for both electrically and thermally excited magnons, and the magnon relaxation length ($λ_m$) of NFO is found to be around 2.5 $μ$m at room temperature. Moreover, at both room and low temperatures, we present distinct features from the nonlocal spin Seebeck signals which arise from magnon polaron formation. Our results demonstrate excellent magnon transport properties (magnon spin conductivity, $λ_m$, and spin mixing conductance at the interface with Pt) of NFO films grown on a lattice-matched substrate that are comparable with those of yttrium iron garnet.
Submitted 24 July, 2018;
originally announced July 2018.
-
On Euclidean $k$-Means Clustering with $α$-Center Proximity
Authors:
Amit Deshpande,
Anand Louis,
Apoorv Vikram Singh
Abstract:
$k$-means clustering is NP-hard in the worst case but previous work has shown efficient algorithms assuming the optimal $k$-means clusters are \emph{stable} under additive or multiplicative perturbation of data. This has two caveats. First, we do not know how to efficiently verify this property of optimal solutions that are NP-hard to compute in the first place. Second, the stability assumptions required for polynomial time $k$-means algorithms are often unreasonable when compared to the ground-truth clusters in real-world data. A consequence of multiplicative perturbation resilience is \emph{center proximity}, that is, every point is closer to the center of its own cluster than the center of any other cluster, by some multiplicative factor $α> 1$.
We study the problem of minimizing the Euclidean $k$-means objective only over clusterings that satisfy $α$-center proximity. We give a simple algorithm to find the optimal $α$-center-proximal $k$-means clustering in running time exponential in $k$ and $1/(α- 1)$ but linear in the number of points and the dimension. We define an analogous $α$-center proximity condition for outliers, and give similar algorithmic guarantees for $k$-means with outliers and $α$-center proximity. On the hardness side we show that for any $α' > 1$, there exists an $α\leq α'$, $(α>1)$, and an $\varepsilon_0 > 0$ such that minimizing the $k$-means objective over clusterings that satisfy $α$-center proximity is NP-hard to approximate within a multiplicative $(1+\varepsilon_0)$ factor.
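The $α$-center proximity condition described above can be illustrated with a small check. This is only a sketch under stated assumptions, not the algorithm from the paper: it assumes the cluster centers are the cluster means and reads the condition as $\|x - c_j\| > α\,\|x - c_i\|$ for every point $x$ in cluster $i$ and every other cluster $j$. The function names `cluster_means` and `satisfies_center_proximity` are hypothetical.

```python
import math


def cluster_means(points, labels):
    """Return a dict mapping each label to the mean of its points (assumed centers)."""
    sums, counts = {}, {}
    for point, label in zip(points, labels):
        if label not in sums:
            sums[label] = [0.0] * len(point)
            counts[label] = 0
        for i, value in enumerate(point):
            sums[label][i] += value
        counts[label] += 1
    return {label: tuple(s / counts[label] for s in sums[label]) for label in sums}


def satisfies_center_proximity(points, labels, alpha):
    """Check that every point is closer to its own center than to any other
    center by a multiplicative factor alpha (illustrative reading of the condition)."""
    centers = cluster_means(points, labels)
    for point, label in zip(points, labels):
        own_distance = math.dist(point, centers[label])
        for other_label, center in centers.items():
            if other_label == label:
                continue
            # Violation: another center is within alpha times the own-center distance.
            if math.dist(point, center) <= alpha * own_distance:
                return False
    return True


# Two well-separated clusters of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]
print(satisfies_center_proximity(points, labels, alpha=2.0))
print(satisfies_center_proximity(points, labels, alpha=25.0))
```

The check only verifies the condition for a given clustering; it does not search for the optimal $α$-center-proximal clustering.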
Submitted 26 February, 2019; v1 submitted 28 April, 2018;
originally announced April 2018.
-
Investigation of the tunnel magnetoresistance in junctions with a strontium stannate barrier
Authors:
Matthias Althammer,
Amit Vikam Singh,
Sahar Keshavarz,
Mehmet Kenan Yurtisigi,
Rohan Mishra,
Albina Borisevich,
Patrick LeClair,
Arunava Gupta
Abstract:
We experimentally investigate the structural, magnetic and electrical transport properties of La$_{0.67}$Sr$_{0.33}$MnO$_{3}$ based magnetic tunnel junctions with a SrSnO$_3$ barrier. Our results show that despite the large number of defects in the strontium stannate barrier, due to the large lattice mismatch, the observed tunnel magnetoresistance (TMR) is comparable to tunnel junctions with a better lattice-matched SrTiO$_3$ barrier, reaching values of up to 350% at T=5 K. Further analysis of the current-voltage characteristics of the junction and the bias voltage dependence of the observed TMR shows a decrease of the TMR with increasing bias voltage. In addition, the observed TMR vanishes for T>200 K. Our results suggest that by employing a better lattice-matched ferromagnetic electrode and thus reducing the structural defects in the strontium stannate barrier even larger TMR ratios might be possible in the future.
△ Less
Submitted 28 July, 2016;
originally announced July 2016.
-
Studies On Falling Ball Viscometry
Authors:
Amit Vikram Singh,
Lavanjay Sharma,
Pinaki Gupta-Bhaya
Abstract:
A new method of accurate calculation of the coefficient of viscosity of a test liquid from experimentally measured terminal velocity of a ball falling in the test liquid contained in a narrow tube is described. The calculation requires the value of a multiplicative correction factor to the apparent coefficient of viscosity calculated by substitution of the terminal velocity of the falling ball in Stokes' formula. This correction factor, the so-called viscosity ratio, a measure of deviation from the Stokes limit, arises from non-vanishing values of the Reynolds number and the ball/tube radius ratio. The method, valid over a very wide range of Reynolds number, is based on the recognition of a relationship between two measures of wall effect: the more widely investigated velocity ratio, defined as the ratio of terminal velocity in a confined medium to that in a boundless medium, and the viscosity ratio. The calculation uses two recently published correlation formulae based on extensive experimental results on terminal velocity of a falling ball. The first formula relates velocity ratio to Reynolds number and ball-tube radius ratio. The second formula gives an expression of the ratio of the drag force actually sensed by the ball falling in an infinite medium to that in the Stokes limit as a function of Reynolds number alone. It is shown that appropriate use of this correction factor extends the utility of the technique of falling ball viscometry beyond the very low Reynolds number 'creeping flow' regime, to which its application is presently restricted. Issues related to accuracy are examined by use of our own measurements of the terminal velocity of a falling ball in a narrow tube and those from published literature reports, on liquids of known viscosity coefficient.
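As a rough illustration of the first step described above, the apparent viscosity follows from substituting the terminal velocity into the standard Stokes formula, $η = 2 r^2 (ρ_{ball} - ρ_{liquid}) g / (9 v)$. The sketch below assumes SI units; the function names and the sample values are hypothetical. The viscosity ratio is taken as an input because the correlation formulae that determine it are not given in the abstract.

```python
def apparent_viscosity(radius, rho_ball, rho_liquid, v_terminal, g=9.81):
    """Apparent viscosity (Pa*s) from the Stokes formula, ignoring wall and
    finite-Reynolds-number effects. Inputs are in SI units."""
    return 2.0 * radius**2 * (rho_ball - rho_liquid) * g / (9.0 * v_terminal)


def corrected_viscosity(apparent, viscosity_ratio):
    """Apply the multiplicative correction factor (the viscosity ratio).
    The ratio must be supplied by the caller; how to compute it from the
    Reynolds number and radius ratio is not specified in the abstract."""
    return apparent * viscosity_ratio


# Hypothetical example: a 1 mm radius steel ball in a glycerol-like liquid.
eta_apparent = apparent_viscosity(
    radius=1e-3, rho_ball=7800.0, rho_liquid=1260.0, v_terminal=0.01
)
print(eta_apparent)
# A ratio of 1.0 corresponds to the uncorrected Stokes limit.
print(corrected_viscosity(eta_apparent, viscosity_ratio=1.0))
```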
Submitted 13 February, 2012; v1 submitted 7 February, 2012;
originally announced February 2012.