Search | arXiv e-print repository

Direct measurement of the viscocapillary lift force near a liquid interface

Authors: Hao Zhang, Zaicheng Zhang, Aditya Jha, Yacine Amarouchene, Thomas Salez, Thomas Guérin, Chaouqi Misbah, Abdelhamid Maali

Abstract: Lift force of viscous origin is widespread across disciplines, from mechanics to biology. Here, we present the first direct measurement of the lift force acting on a particle moving in a viscous fluid along the liquid interface that separates two liquids. The force arises from the coupling between the viscous flow induced by the particle motion and the capillary deformation of the interface. The measurements show that the lift force increases as the distance between the sphere and the interface decreases, reaching saturation at small distances. The experimental results are in good agreement with the model and numerical calculation developed within the framework of the soft lubrication theory.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2405.13209 [pdf, other]

Investigating Symbolic Capabilities of Large Language Models

Authors: Neisarg Dave, Daniel Kifer, C. Lee Giles, Ankur Mali

Abstract: Prompting techniques have significantly enhanced the capabilities of Large Language Models (LLMs) across various complex tasks, including reasoning, planning, and solving math word problems. However, most research has predominantly focused on language-based reasoning and word problems, often overlooking the potential of LLMs in handling symbol-based calculations and reasoning. This study aims to bridge this gap by rigorously evaluating LLMs on a series of symbolic tasks, such as addition, multiplication, modulus arithmetic, numerical precision, and symbolic counting. Our analysis encompasses eight LLMs, including four enterprise-grade and four open-source models, of which three have been pre-trained on mathematical tasks. The assessment framework is anchored in Chomsky's Hierarchy, providing a robust measure of the computational abilities of these models. The evaluation employs minimally explained prompts alongside the zero-shot Chain of Thoughts technique, allowing models to navigate the solution process autonomously. The findings reveal a significant decline in LLMs' performance on context-free and context-sensitive symbolic tasks as the complexity, represented by the number of symbols, increases. Notably, even the fine-tuned GPT3.5 exhibits only marginal improvements, mirroring the performance trends observed in other models. Across the board, all models demonstrated a limited generalization ability on these symbol-intensive tasks. This research underscores LLMs' challenges with increasing symbolic complexity and highlights the need for specialized training, memory and architectural adjustments to enhance their proficiency in symbol-based reasoning tasks.

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2403.18929 [pdf, other]

A Review of Neuroscience-Inspired Machine Learning

Authors: Alexander Ororbia, Ankur Mali, Adam Kohan, Beren Millidge, Tommaso Salvatori

Abstract: One major criticism of deep learning centers around the biological implausibility of the credit assignment schema used for learning -- backpropagation of errors. This implausibility translates into practical limitations, spanning scientific fields, including incompatibility with hardware and non-differentiable implementations, thus leading to expensive energy requirements. In contrast, biologically plausible credit assignment is compatible with practically any learning condition and is energy-efficient. As a result, it accommodates hardware and scientific modeling, e.g. learning with physical systems and non-differentiable behavior. Furthermore, it can lead to the development of real-time, adaptive neuromorphic processing systems. In addressing this problem, an interdisciplinary branch of artificial intelligence research that lies at the intersection of neuroscience, cognitive science, and machine learning has emerged. In this paper, we survey several vital algorithms that model bio-plausible rules of credit assignment in artificial neural networks, discussing the solutions they provide for different scientific fields as well as their advantages on CPUs, GPUs, and novel implementations of neuromorphic hardware. We conclude by discussing the future challenges that will need to be addressed in order to make such algorithms more useful in practical applications.

Submitted 16 February, 2024; originally announced March 2024.

Comments: 13 Pages, 1 figure

arXiv:2402.12465 [pdf, other]

Neuro-mimetic Task-free Unsupervised Online Learning with Continual Self-Organizing Maps

Authors: Hitesh Vaidya, Travis Desell, Ankur Mali, Alexander Ororbia

Abstract: An intelligent system capable of continual learning is one that can process and extract knowledge from potentially infinitely long streams of pattern vectors. The major challenge that makes crafting such a system difficult is known as catastrophic forgetting - an agent, such as one based on artificial neural networks (ANNs), struggles to retain previously acquired knowledge when learning from new samples. Furthermore, ensuring that knowledge is preserved for previous tasks becomes more challenging when input is not supplemented with task boundary information. Although forgetting in the context of ANNs has been studied extensively, there still exists far less work investigating it in terms of unsupervised architectures such as the venerable self-organizing map (SOM), a neural model often used in clustering and dimensionality reduction. While the internal mechanisms of SOMs could, in principle, yield sparse representations that improve memory retention, we observe that, when a fixed-size SOM processes continuous data streams, it experiences concept drift. In light of this, we propose a generalization of the SOM, the continual SOM (CSOM), which is capable of online unsupervised learning under a low memory budget. Our results, on benchmarks including MNIST, Kuzushiji-MNIST, and Fashion-MNIST, show almost a twofold increase in accuracy, and CIFAR-10 demonstrates a state-of-the-art result when tested in the (online) unsupervised class-incremental learning setting.

Submitted 19 February, 2024; originally announced February 2024.

arXiv:2402.02790 [pdf, other]

Stable and Robust Deep Learning By Hyperbolic Tangent Exponential Linear Unit (TeLU)

Authors: Alfredo Fernandez, Ankur Mali

Abstract: In this paper, we introduce the Hyperbolic Tangent Exponential Linear Unit (TeLU), a novel neural network activation function, represented as $f(x) = x \cdot \tanh(e^x)$. TeLU is designed to overcome the limitations of conventional activation functions like ReLU, GELU, and Mish by addressing the vanishing and, to an extent, the exploding gradient problems. Our theoretical analysis and empirical assessments reveal that TeLU outperforms existing activation functions in stability and robustness, effectively adjusting activation outputs' mean towards zero for enhanced training stability and convergence. Extensive evaluations against popular activation functions (ReLU, GELU, SiLU, Mish, Logish, Smish) across advanced architectures, including ResNet-50, demonstrate TeLU's lower variance and superior performance, even under hyperparameter conditions optimized for other functions. In large-scale tests with challenging datasets like CIFAR-10, CIFAR-100, and TinyImageNet, encompassing 860 scenarios, TeLU consistently showcased its effectiveness, positioning itself as a potential new standard for neural network activation functions, boosting stability and performance in diverse deep learning applications.
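
The formula quoted in this abstract, f(x) = x * tanh(e^x), can be written out as a minimal scalar sketch. This is not the authors' implementation: the function names `telu` and `telu_grad` and the overflow clamp at 50 are assumptions made here for illustration, and the derivative is simply what the chain rule gives for the stated formula.

```python
import math

# Assumed clamp (not from the source): tanh(exp(50)) is already exactly 1.0
# in double precision, and clamping avoids OverflowError from math.exp.
_EXP_CLAMP = 50.0


def telu(x: float) -> float:
    """TeLU as stated in the abstract: f(x) = x * tanh(exp(x))."""
    return x * math.tanh(math.exp(min(x, _EXP_CLAMP)))


def telu_grad(x: float) -> float:
    """Derivative by the chain rule: tanh(e^x) + x * e^x * (1 - tanh(e^x)^2)."""
    e = math.exp(min(x, _EXP_CLAMP))
    t = math.tanh(e)
    return t + x * e * (1.0 - t * t)


if __name__ == "__main__":
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"x={x:+.1f}  telu={telu(x):+.6f}  grad={telu_grad(x):+.6f}")
```

For large positive inputs the function approaches the identity, and for large negative inputs it approaches zero from below, which is what the closed form implies.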

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2402.02627 [pdf, other]

Stability Analysis of Various Symbolic Rule Extraction Methods from Recurrent Neural Network

Authors: Neisarg Dave, Daniel Kifer, C. Lee Giles, Ankur Mali

Abstract: This paper analyzes two competing rule extraction methodologies: quantization and equivalence query. We trained $3600$ RNN models, extracting $18000$ DFA with a quantization approach (k-means and SOM) and $3600$ DFA by equivalence query ($L^{*}$) methods across $10$ initialization seeds. We sampled the datasets from $7$ Tomita and $4$ Dyck grammars and trained them on $4$ RNN cells: LSTM, GRU, O2RNN, and MIRNN. The observations from our experiments establish the superior performance of O2RNN and quantization-based rule extraction over others. $L^{*}$, primarily proposed for regular grammars, performs similarly to quantization methods for Tomita languages when neural networks are perfectly trained. However, for partially trained RNNs, $L^{*}$ shows instability in the number of states in DFA, e.g., for Tomita 5 and Tomita 6 languages, $L^{*}$ produced more than $100$ states. In contrast, quantization methods result in rules with a number of states very close to the ground truth DFA. Among RNN cells, O2RNN produces stable DFA consistently compared to other cells. For Dyck Languages, we observe that although GRU outperforms other RNNs in network performance, the DFA extracted by O2RNN has higher performance and better stability. The stability is computed as the standard deviation of accuracy on test sets on networks trained across $10$ seeds. On Dyck Languages, quantization methods outperformed $L^{*}$ with better stability in accuracy and the number of states. $L^{*}$ often showed instability in accuracy in the order of $16\% - 22\%$ for GRU and MIRNN while deviation for quantization methods varied in $5\% - 15\%$. In many instances with LSTM and GRU, DFAs extracted by $L^{*}$ even failed to beat chance accuracy ($50\%$), while those extracted by the quantization method had a standard deviation in the $7\%-17\%$ range. For O2RNN, both rule extraction methods had deviation in the $0.5\% - 3\%$ range.
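
The stability measure described in this abstract (standard deviation of test accuracy over networks trained with different seeds) can be sketched as follows. The abstract does not say whether the sample or population standard deviation is used; the sample version is assumed here, and the function name `stability` and the accuracy values are hypothetical placeholders, not results from the paper.

```python
from statistics import stdev


def stability(accuracies_per_seed: list[float]) -> float:
    """Spread of test accuracy across seeds; lower means more stable.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    if len(accuracies_per_seed) < 2:
        raise ValueError("need accuracies from at least two seeds")
    return stdev(accuracies_per_seed)


if __name__ == "__main__":
    # Hypothetical accuracies for one (grammar, cell, extractor) setting
    # over 10 seeds; illustrative numbers only.
    accs = [0.97, 0.99, 0.98, 0.96, 0.99, 0.98, 0.97, 0.99, 0.98, 0.97]
    print(f"stability (std of accuracy): {stability(accs):.4f}")
```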

Submitted 4 February, 2024; originally announced February 2024.

arXiv:2309.14691 [pdf, other]

On the Computational Complexity and Formal Hierarchy of Second Order Recurrent Neural Networks

Authors: Ankur Mali, Alexander Ororbia, Daniel Kifer, Lee Giles

Abstract: Artificial neural networks (ANNs) with recurrence and self-attention have been shown to be Turing-complete (TC). However, existing work has shown that these ANNs require multiple turns or unbounded computation time, even with unbounded precision in weights, in order to recognize TC grammars. However, under constraints such as fixed or bounded precision neurons and time, ANNs without memory are shown to struggle to recognize even context-free languages. In this work, we extend the theoretical foundation for the $2^{nd}$-order recurrent network ($2^{nd}$ RNN) and prove there exists a class of $2^{nd}$ RNN that is Turing-complete with bounded time. This model is capable of directly encoding a transition table into its recurrent weights, enabling bounded time computation and is interpretable by design. We also demonstrate that $2^{nd}$-order RNNs, without memory, under bounded weights and time constraints, outperform modern-day models such as vanilla RNNs and gated recurrent units in recognizing regular grammars. We provide an upper bound and a stability analysis on the maximum number of neurons required by $2^{nd}$-order RNNs to recognize any class of regular grammar. Extensive experiments on the Tomita grammars support our findings, demonstrating the importance of tensor connections in crafting computationally efficient RNNs. Finally, we show $2^{nd}$-order RNNs are also interpretable by extraction and can extract state machines with higher success rates as compared to first-order RNNs. Our results extend the theoretical foundations of RNNs and offer promising avenues for future explainable AI research.
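
The idea of writing a transition table directly into second-order (tensor) recurrent weights can be illustrated with a small sketch. This is a deliberately simplified version with one-hot states, one-hot inputs, and no nonlinearity; it is not the construction proved in the paper, and the names `encode_dfa`, `step`, and `accepts` are hypothetical. The example automaton is a two-state DFA for the language 1* (Tomita grammar 1).

```python
def one_hot(index: int, size: int) -> list[float]:
    """Return a one-hot vector of the given size."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def encode_dfa(num_states: int, num_symbols: int,
               delta: dict[tuple[int, int], int]) -> list[list[list[float]]]:
    """Build W[i][j][k] = 1 if reading symbol k in state j leads to state i."""
    W = [[[0.0] * num_symbols for _ in range(num_states)]
         for _ in range(num_states)]
    for (state, symbol), nxt in delta.items():
        W[nxt][state][symbol] = 1.0
    return W


def step(W, h: list[float], x: list[float]) -> list[float]:
    """Second-order update: h'[i] = sum over j, k of W[i][j][k] * h[j] * x[k]."""
    return [sum(W[i][j][k] * h[j] * x[k]
                for j in range(len(h)) for k in range(len(x)))
            for i in range(len(W))]


def accepts(W, string: list[int], start: int, accepting: set[int],
            num_symbols: int) -> bool:
    """Run the tensor recurrence over the string and read off the final state."""
    h = one_hot(start, len(W))
    for symbol in string:
        h = step(W, h, one_hot(symbol, num_symbols))
    return h.index(max(h)) in accepting


# DFA for 1*: state 0 is accepting; any 0 moves to the sink state 1.
DELTA = {(0, 1): 0, (0, 0): 1, (1, 0): 1, (1, 1): 1}
W_TOMITA1 = encode_dfa(num_states=2, num_symbols=2, delta=DELTA)

if __name__ == "__main__":
    for s in ([], [1, 1, 1], [1, 0, 1]):
        print(s, accepts(W_TOMITA1, s, start=0, accepting={0}, num_symbols=2))
```

Because the hidden state stays one-hot, each update performs exactly one table lookup, so the string is processed in a single pass with one step per symbol.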

Submitted 26 September, 2023; originally announced September 2023.

Comments: 12 pages, 5 tables, 1 figure

arXiv:2309.14690 [pdf, ps, other]

On the Tensor Representation and Algebraic Homomorphism of the Neural State Turing Machine

Authors: Ankur Mali, Alexander Ororbia, Daniel Kifer, Lee Giles

Abstract: Recurrent neural networks (RNNs) and transformers have been shown to be Turing-complete, but this result assumes infinite precision in their hidden representations, positional encodings for transformers, and unbounded computation time in general. In practical applications, however, it is crucial to have real-time models that can recognize Turing-complete grammars in a single pass. To address this issue and to better understand the true computational power of artificial neural networks (ANNs), we introduce a new class of recurrent models called the neural state Turing machine (NSTM). The NSTM has bounded weights and finite-precision connections and can simulate any Turing Machine in real-time. In contrast to prior work that assumes unbounded time and precision in weights to demonstrate equivalence with TMs, we prove that a $13$-neuron bounded tensor RNN, coupled with third-order synapses, can model any TM class in real-time. Furthermore, under the Markov assumption, we provide a new theoretical bound for a non-recurrent network augmented with memory, showing that a tensor feedforward network with $25$th-order finite precision weights is equivalent to a universal TM.

Submitted 26 September, 2023; originally announced September 2023.

Comments: 14 pages, 7 tables

arXiv:2308.07870 [pdf, other]

Brain-Inspired Computational Intelligence via Predictive Coding

Authors: Tommaso Salvatori, Ankur Mali, Christopher L. Buckley, Thomas Lukasiewicz, Rajesh P. N. Rao, Karl Friston, Alexander Ororbia

Abstract: Artificial intelligence (AI) is rapidly becoming one of the key technologies of this century. The majority of results in AI thus far have been achieved using deep neural networks trained with the error backpropagation learning algorithm. However, the ubiquitous adoption of this approach has highlighted some important limitations such as substantial computational cost, difficulty in quantifying uncertainty, lack of robustness, unreliability, and biological implausibility. It is possible that addressing these limitations may require schemes that are inspired and guided by neuroscience theories. One such theory, called predictive coding (PC), has shown promising performance in machine intelligence tasks, exhibiting exciting properties that make it potentially valuable for the machine learning community: PC can model information processing in different brain areas, can be used in cognitive control and robotics, and has a solid mathematical grounding in variational inference, offering a powerful inversion scheme for a specific class of continuous-state generative models. With the hope of foregrounding research in this direction, we survey the literature that has contributed to this perspective, highlighting the many ways that PC might play a role in the future of machine learning and computational intelligence at large.

Submitted 15 August, 2023; originally announced August 2023.

Comments: 37 Pages, 9 Figures

arXiv:2307.05991 [pdf, other]

Unsteady drag force on an immersed sphere oscillating near a wall

Authors: Zaicheng Zhang, Vincent Bertin, Martin Essink, Hao Zhang, Nicolas Fares, Zaiyi Shen, Thomas Bickel, Thomas Salez, Abdelhamid Maali

Abstract: The unsteady hydrodynamic drag exerted on an oscillating sphere near a planar wall is addressed experimentally, theoretically, and numerically. The experiments are performed by using colloidal-probe Atomic Force Microscopy (AFM) in thermal noise mode. The natural resonance frequencies and quality factors are extracted from the measurement of the power spectrum density of the probe oscillation for a broad range of gap distances and Womersley numbers. The shift in the natural resonance frequency of the colloidal probe as the probe goes close to a solid wall reflects the wall-induced variations of the effective mass of the probe. Interestingly, a crossover from a positive to a negative shift is observed as the Womersley number increases. In order to rationalize the results, the confined unsteady Stokes equation is solved numerically using a finite-element method, as well as asymptotic calculations. The in-phase and out-of-phase terms of the hydrodynamic drag acting on the sphere are obtained and agree well with the experimental results. Altogether, the experimental, theoretical, and numerical results show that the hydrodynamic force felt by an immersed sphere oscillating near a wall is highly dependent on the Womersley number.

Submitted 12 July, 2023; originally announced July 2023.

arXiv:2301.01452 [pdf, other]

The Predictive Forward-Forward Algorithm

Authors: Alexander Ororbia, Ankur Mali

Abstract: We propose the predictive forward-forward (PFF) algorithm for conducting credit assignment in neural systems. Specifically, we design a novel, dynamic recurrent neural system that learns a directed generative circuit jointly and simultaneously with a representation circuit. Notably, the system integrates learnable lateral competition, noise injection, and elements of predictive coding, an emerging and viable neurobiological process theory of cortical function, with the forward-forward (FF) adaptation scheme. Furthermore, PFF efficiently learns to propagate learning signals and updates synapses with forward passes only, eliminating key structural and computational constraints imposed by backpropagation-based schemes. Besides computational advantages, the PFF process could prove useful for understanding the learning mechanisms behind biological neurons that use local signals despite missing feedback connections. We run experiments on image data and demonstrate that the PFF procedure works as well as backpropagation, offering a promising brain-inspired algorithm for classifying, reconstructing, and synthesizing data patterns.

Submitted 2 April, 2023; v1 submitted 4 January, 2023; originally announced January 2023.

Comments: More revisions/edits, update to key diagram depicting PFF process, link to algorithm / simulation code (repo) now included

arXiv:2211.12047 [pdf, other]

Convolutional Neural Generative Coding: Scaling Predictive Coding to Natural Images

Authors: Alexander Ororbia, Ankur Mali

Abstract: In this work, we develop convolutional neural generative coding (Conv-NGC), a generalization of predictive coding to the case of convolution/deconvolution-based computation. Specifically, we concretely implement a flexible neurobiologically-motivated algorithm that progressively refines latent state feature maps in order to dynamically form a more accurate internal representation/reconstruction model of natural images. The performance of the resulting sensory processing system is evaluated on complex datasets such as Color-MNIST, CIFAR-10, and Street View House Numbers (SVHN). We study the effectiveness of our brain-inspired model on the tasks of reconstruction and image denoising and find that it is competitive with convolutional auto-encoding systems trained by backpropagation of errors and outperforms them with respect to out-of-distribution reconstruction (including the full 90k CINIC-10 test set).

Submitted 5 February, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

Comments: Revisions/updates, expanded appendix

arXiv:2210.05487 [pdf, other]

Like a bilingual baby: The advantage of visually grounding a bilingual language model

Authors: Khai-Nguyen Nguyen, Zixin Tang, Ankur Mali, Alex Kelly

Abstract: Unlike most neural language models, humans learn language in a rich, multi-sensory and, often, multi-lingual environment. Current language models typically fail to fully capture the complexities of multilingual language use. We train an LSTM language model on images and captions in English and Spanish from MS-COCO-ES. We find that the visual grounding improves the model's understanding of semantic similarity both within and across languages and improves perplexity. However, we find no significant advantage of visual grounding for abstract words. Our results provide additional evidence of the advantages of visually grounded language models and point to the need for more naturalistic language data from multilingual speakers and multilingual datasets with perceptual grounding.

Submitted 13 February, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

Comments: Preprint, 7 pages, 2 tables, 1 figure

arXiv:2209.09174 [pdf, other]

Active Predicting Coding: Brain-Inspired Reinforcement Learning for Sparse Reward Robotic Control Problems

Authors: Alexander Ororbia, Ankur Mali

Abstract: In this article, we propose a backpropagation-free approach to robotic control through the neuro-cognitive computational framework of neural generative coding (NGC), designing an agent built completely from powerful predictive coding/processing circuits that facilitate dynamic, online learning from sparse rewards, embodying the principles of planning-as-inference. Concretely, we craft an adaptive agent system, which we call active predictive coding (ActPC), that balances an internally-generated epistemic signal (meant to encourage intelligent exploration) with an internally-generated instrumental signal (meant to encourage goal-seeking behavior) to ultimately learn how to control various simulated robotic systems as well as a complex robotic arm using a realistic robotics simulator, i.e., the Surreal Robotics Suite, for the block lifting task and can pick-and-place problems. Notably, our experimental results demonstrate that our proposed ActPC agent performs well in the face of sparse (extrinsic) reward signals and is competitive with or outperforms several powerful backprop-based RL approaches.

Submitted 19 September, 2022; originally announced September 2022.

Comments: Contains appendix with pseudocode and additional details

arXiv:2206.01820 [pdf, other]

A Robust Backpropagation-Free Framework for Images

Authors: Timothy Zee, Alexander G. Ororbia, Ankur Mali, Ifeoma Nwogu

Abstract: While current deep learning algorithms have been successful for a wide variety of artificial intelligence (AI) tasks, including those involving structured image data, they present deep neurophysiological conceptual issues due to their reliance on the gradients that are computed by backpropagation of errors (backprop). Gradients are required to obtain synaptic weight adjustments but require knowledge of feed-forward activities in order to conduct backward propagation, a biologically implausible process. This is known as the "weight transport problem". Therefore, in this work, we present a more biologically plausible approach towards solving the weight transport problem for image data. This approach, which we name the error kernel driven activation alignment (EKDAA) algorithm, accomplishes this through the introduction of locally derived error transmission kernels and error maps. Like standard deep learning networks, EKDAA performs the standard forward process via weights and activation functions; however, its backward error computation involves adaptive error kernels that propagate local error signals through the network. The efficacy of EKDAA is demonstrated by performing visual-recognition tasks on the Fashion MNIST, CIFAR-10 and SVHN benchmarks, along with demonstrating its ability to extract visual features from natural color images. Furthermore, in order to demonstrate its non-reliance on gradient computations, results are presented for an EKDAA-trained CNN that employs a non-differentiable activation function.

Submitted 5 November, 2023; v1 submitted 3 June, 2022; originally announced June 2022.

arXiv:2202.04386 [pdf]

Contactless Rheology of Soft Gels over a Broad Frequency Range

Authors: Zaicheng Zhang, Muhammad Arshad, Vincent Bertin, Samir Almohamad, Elie Raphaël, Thomas Salez, Abdelhamid Maali

Abstract: We report contactless measurements of the viscoelastic rheological properties of soft gels. The experiments are performed using a colloidal-probe Atomic Force Microscope (AFM) in a liquid environment and in dynamic mode. The mechanical response is measured as a function of the liquid gap thickness for different oscillation frequencies. Our measurements reveal an elastohydrodynamic (EHD) coupling between the flow induced by the probe oscillation and the viscoelastic deformation of the gels. The data are quantitatively described by a viscoelastic lubrication model. The frequency-dependent storage and loss moduli of the polydimethylsiloxane (PDMS) gels are extracted from fits of the data to the model and are in good agreement with the Chasset--Thirion law. Our results demonstrate that contactless colloidal-probe methods are powerful tools that can be used for probing soft interfaces finely over a wide range of frequencies.

Submitted 9 February, 2022; originally announced February 2022.

arXiv:2201.11795 [pdf, other]

Neural JPEG: End-to-End Image Compression Leveraging a Standard JPEG Encoder-Decoder

Authors: Ankur Mali, Alexander Ororbia, Daniel Kifer, Lee Giles

Abstract: Recent advances in deep learning have led to superhuman performance across a variety of applications. Recently, these methods have been successfully employed to improve the rate-distortion performance in the task of image compression. However, current methods either use additional post-processing blocks on the decoder end to improve compression or propose an end-to-end compression scheme based on heuristics. For the majority of these, the trained deep neural networks (DNNs) are not compatible with standard encoders and would be difficult to deploy on personal computers and cellphones. In light of this, we propose a system that learns to improve the encoding performance by enhancing its internal neural representations on both the encoder and decoder ends, an approach we call Neural JPEG. We propose frequency domain pre-editing and post-editing methods to optimize the distribution of the DCT coefficients at both encoder and decoder ends in order to improve the standard compression (JPEG) method. Moreover, we design and integrate a scheme for jointly learning quantization tables within this hybrid neural compression framework. Experiments demonstrate that our approach successfully improves the rate-distortion performance over JPEG across various quality metrics, such as PSNR and MS-SSIM, and generates visually appealing images with better color retention quality.

Submitted 31 January, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

Comments: Accepted in DCC 2022, 11 pages

arXiv:2201.11782 [pdf, other]

An Empirical Analysis of Recurrent Learning Algorithms In Neural Lossy Image Compression Systems

Authors: Ankur Mali, Alexander Ororbia, Daniel Kifer, Lee Giles

Abstract: Recent advances in deep learning have resulted in image compression algorithms that outperform JPEG and JPEG 2000 on the standard Kodak benchmark. However, they are slow to train (due to backprop-through-time) and, to the best of our knowledge, have not been systematically evaluated on a large variety of datasets. In this paper, we perform the first large-scale comparison of recent state-of-the-art hybrid neural compression algorithms, while exploring the effects of alternative training strategies (when applicable). The hybrid recurrent neural decoder is a former state-of-the-art model (recently overtaken by a Google model) that can be trained using backprop-through-time (BPTT) or with alternative algorithms like sparse attentive backtracking (SAB), unbiased online recurrent optimization (UORO), and real-time recurrent learning (RTRL). We compare these training alternatives along with the Google models (GOOG and E2E) on 6 benchmark datasets. Surprisingly, we found that the model trained with SAB performs better (outperforming even BPTT), resulting in faster convergence and a better peak signal-to-noise ratio.

Submitted 27 January, 2022; originally announced January 2022.

Comments: Accepted at DCC 2021, 15 pages

arXiv:2201.01022 [pdf, other]

doi 10.1103/PhysRevE.105.064606

Electroviscous drag on squeezing motion in sphere-plane geometry

Authors: Marcela Rodriguez Matus, Zaicheng Zhang, Zouhir Benrahla, Arghya Majee, Abdelhamid Maali, Alois Würger

Abstract: Theoretically and experimentally, we study electroviscous phenomena resulting from charge-flow coupling in a nanoscale capillary. Our theoretical approach relies on Poisson-Boltzmann mean-field theory and on coupled linear relations for charge and hydrodynamic flows, including electro-osmosis and charge advection. With respect to the unperturbed Poiseuille flow, we define an electroviscous coupling parameter $ξ$, which turns out to be maximum where the film thickness $h_0$ is comparable to the screening length $λ$. We also present dynamic AFM data for the visco-elastic response of a confined water film in sphere-plane geometry; our theory provides a quantitative description for the electroviscous drag coefficient and the electrostatic repulsion as a function of the film thickness, with the surface charge density as the only free parameter. Charge regulation sets in at even smaller distances.

Submitted 28 May, 2022; v1 submitted 4 January, 2022; originally announced January 2022.

Comments: 11 pages, 14 figures

arXiv:2108.06361 [pdf, ps, other]

doi 10.1002/mma.8441

On tempered fractional calculus with respect to functions and the associated fractional differential equations

Authors: Ashwini D. Mali, Kishor D. Kucche, Arran Fernandez, Hafiz Muhammad Fahad

Abstract: The prime aim of the present paper is to continue developing the theory of tempered fractional integrals and derivatives of a function with respect to another function. This theory combines the tempered fractional calculus with the $Ψ$-fractional calculus, both of which have found applications in topics including continuous time random walks. After studying the basic theory of the $Ψ$-tempered operators, we prove mean value theorems and Taylor's theorems for both Riemann--Liouville type and Caputo type cases of these operators. Furthermore, we study some nonlinear fractional differential equations involving $Ψ$-tempered derivatives, proving existence-uniqueness theorems by using the Banach contraction principle, and proving stability results by using Grönwall type inequalities.

Submitted 18 February, 2022; v1 submitted 13 August, 2021; originally announced August 2021.

Comments: 34

arXiv:2107.07046 [pdf, other]

Backprop-Free Reinforcement Learning with Active Neural Generative Coding

Authors: Alexander Ororbia, Ankur Mali

Abstract: In humans, perceptual awareness facilitates the fast recognition and extraction of information from sensory input. This awareness largely depends on how the human agent interacts with the environment. In this work, we propose active neural generative coding, a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments. Specifically, we develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference. We demonstrate on several simple control problems that our framework performs competitively with deep Q-learning. The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.

Submitted 19 December, 2021; v1 submitted 10 July, 2021; originally announced July 2021.

Comments: Updates to accepted version, experiments now include ICM and RnD baselines

arXiv:2104.09403 [pdf, other]

OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas

Authors: Shivansh Rao, Vikas Kumar, Daniel Kifer, Lee Giles, Ankur Mali

Abstract: Given a single RGB panorama, the goal of 3D layout reconstruction is to estimate the room layout by predicting the corners, floor boundary, and ceiling boundary. A common approach has been to use standard convolutional networks to predict the corners and boundaries, followed by post-processing to generate the 3D layout. However, the space-varying distortions in panoramic images are not compatible with the translational equivariance property of standard convolutions, thus degrading performance. Instead, we propose to use spherical convolutions. The resulting network, which we call OmniLayout, performs convolutions directly on the sphere surface, sampling according to inverse equirectangular projection, and is hence invariant to equirectangular distortions. Using a new evaluation metric, we show that our network reduces the error in the heavily distorted regions (near the poles) by approximately 25% when compared to standard convolutional networks. Experimental results show that OmniLayout outperforms the state-of-the-art by approximately 4% on two different benchmark datasets (PanoContext and Stanford 2D-3D). Code is available at https://github.com/rshivansh/OmniLayout.

Submitted 19 April, 2021; originally announced April 2021.

Comments: Accepted at CVPR, OmniCV Workshop. 10 Pages, 9 Figures, 6 Tables

arXiv:2104.02899 [pdf, other]

Recognizing and Verifying Mathematical Equations using Multiplicative Differential Neural Units

Authors: Ankur Mali, Alexander Ororbia, Daniel Kifer, C. Lee Giles

Abstract: Automated mathematical reasoning is a challenging problem that requires an agent to learn algebraic patterns that contain long-range dependencies. Two particular tasks that test this type of reasoning are (1) mathematical equation verification, which requires determining whether trigonometric and linear algebraic statements are valid identities or not, and (2) equation completion, which entails filling in a blank within an expression to make it true. Solving these tasks with deep learning requires that the neural model learn how to manipulate and compose various algebraic symbols, carrying this ability over to previously unseen expressions. Artificial neural networks, including recurrent networks and transformers, struggle to generalize on these kinds of difficult compositional problems, often exhibiting poor extrapolation performance. In contrast, recursive neural networks (recursive-NNs) are, theoretically, capable of achieving better extrapolation due to their tree-like design but are difficult to optimize as the depth of their underlying tree structure increases. To overcome this issue, we extend recursive-NNs to utilize multiplicative, higher-order synaptic connections and, furthermore, to learn to dynamically control and manipulate an external memory. We argue that this key modification gives the neural system the ability to capture powerful transition functions for each possible input.
We demonstrate the effectiveness of our proposed higher-order, memory-augmented recursive-NN models on two challenging mathematical equation tasks, showing improved extrapolation, stable performance, and faster convergence. Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion.

Submitted 6 April, 2021; originally announced April 2021.

arXiv:2012.02949 [pdf, ps, other]

On Coupled System of Nonlinear $Ψ$-Hilfer Hybrid Fractional Differential Equations

Authors: Ashwini D. Mali, Kishor D. Kucche, J. Vanterler da C. Sousa

Abstract: This paper is dedicated to investigating the existence of solutions to the initial value problem (IVP) for a coupled system of $Ψ$-Hilfer hybrid fractional differential equations (FDEs) and the boundary value problem (BVP) for a coupled system of $Ψ$-Hilfer hybrid FDEs. The analysis relies on two fixed point theorems involving three operators defined on a Banach algebra. As an application, we provide concrete examples to illustrate the effectiveness of our results.

Submitted 5 December, 2020; originally announced December 2020.

Comments: 26

arXiv:2009.14531 [pdf, other]

doi 10.1103/PhysRevResearch.3.L032007

Contactless rheology of finite-size air-water interfaces

Authors: Vincent Bertin, Zaicheng Zhang, Rodolphe Boisgard, Christine Grauby-Heywang, Elie Raphael, Thomas Salez, Abdelhamid Maali

Abstract: We present contactless atomic-force microscopy measurements of the hydrodynamic interactions between a rigid sphere and an air bubble in water at the micro-scale. The size of the bubble is found to have a significant effect on the response due to the long-range capillary deformation of the air-water interface. To rationalize the experimental data, we develop a viscocapillary lubrication model accounting for the finite-size effect. The comparison between experiments and theory allows us to measure the air-water surface tension, without contact, paving the way towards robust contactless tensiometry of polluted air-water interfaces.

Submitted 2 March, 2021; v1 submitted 30 September, 2020; originally announced September 2020.

Journal ref: Phys. Rev. Research 3, 032007 (2021)

arXiv:2009.09175 [pdf, other]

On the Boundary Value Problems of $Ψ$-Hilfer Fractional Differential Equations

Authors: Ashwini D. Mali, Kishor D. Kucche

Abstract: In the current paper, we derive the comparison results for the homogeneous and non-homogeneous linear initial value problem (IVP) for $Ψ$-Hilfer fractional differential equations. In the presence of upper and lower solutions, the obtained comparison results and the location of roots theorem are utilized to prove the existence and uniqueness of the solution for the linear $Ψ$-Hilfer boundary value problem (BVP) through the linear non-homogeneous $Ψ$-Hilfer IVP. Assuming the existence of a lower solution $w_0$ and an upper solution $z_0$, we establish the existence of minimal and maximal solutions for the nonlinear $Ψ$-Hilfer BVP in the line segment $[w_0,\,z_0]$ of the weighted space $C_{1-\,γ;\, Ψ}\left( J,\,\mathbb{R}\right)$. Further, it is demonstrated that the iterative Picard type sequences that begin with the lower and upper solutions converge to the minimal and maximal solutions, respectively, and that sequences started from any point on the line segment converge to the exact solution of the nonlinear $Ψ$-Hilfer BVP. Finally, an example is provided in support of the main results.

Submitted 19 September, 2020; originally announced September 2020.

Comments: 30

arXiv:2009.03637 [pdf]

doi 10.1103/PhysRevLett.126.174503

Near-field probe of thermal capillary fluctuations of a hemispherical bubble

Authors: Zaicheng Zhang, Yuliang Wang, Yacine Amarouchene, Rodolphe Boisgard, Hamid Kellay, Alois Würger, Abdelhamid Maali

Abstract: We report measurements of resonant thermal capillary oscillations of a hemispherical liquid-gas interface obtained using a half bubble deposited on a solid substrate. The thermal motion of the hemispherical interface is investigated using an atomic force microscope cantilever that probes the amplitude of vibrations of this interface versus frequency. The spectrum of such nanoscale thermal oscillations of the bubble surface presents several resonance peaks and reveals that the contact line of the hemispherical bubble is pinned on the substrate. The analysis of these peaks allows us to measure the surface viscosity of the bubble interface. Minute amounts of impurities are responsible for altering the rheology of the pure water surface.

Submitted 8 September, 2020; originally announced September 2020.

Comments: 11 pages, 4 figures

Journal ref: Phys. Rev. Lett. 126, 174503 (2021)

arXiv:2008.06306 [pdf, ps, other]

doi 10.1016/j.chaos.2021.111335

On the Nonlinear $Ψ$-Hilfer Hybrid Fractional Differential Equations

Authors: Kishor D. Kucche, Ashwini D. Mali

Abstract: In this paper, we initially derive the equivalent fractional integral equation to $Ψ$-Hilfer hybrid fractional differential equations and through it, we prove the existence of a solution in the weighted space. The primary objective of the paper is to obtain estimates on $Ψ$-Hilfer derivative and utilize it to derive the hybrid fractional differential inequalities involving $Ψ$-Hilfer derivative. With the assistance of these fractional differential inequalities, we determine the existence of extremal solutions, comparison theorems and uniqueness of the solution.

Submitted 14 August, 2020; originally announced August 2020.

Comments: 24

arXiv:2006.03651 [pdf, other]

A provably stable neural network Turing Machine

Authors: John Stogin, Ankur Mali, C Lee Giles

Abstract: We introduce a neural stack architecture, including a differentiable parametrized stack operator that approximates stack push and pop operations for suitable choices of parameters that explicitly represents a stack. We prove the stability of this stack architecture: after arbitrarily many stack operations, the state of the neural stack still closely resembles the state of the discrete stack. Using the neural stack with a recurrent neural network, we introduce a neural network Pushdown Automaton (nnPDA) and prove that nnPDA with finite/bounded neurons and time can simulate any PDA. Furthermore, we extend our construction and propose a new architecture, the neural state Turing Machine (nnTM). We prove that a differentiable nnTM with bounded neurons can simulate a Turing Machine (TM) in real-time. Just like the neural stack, these architectures are also stable. Finally, we extend our construction to show that the differentiable nnTM is equivalent to a Universal Turing Machine (UTM) and can simulate any TM with only seven finite/bounded precision neurons. This work provides a new theoretical bound for the computational capability of bounded precision RNNs augmented with memory.

Submitted 18 September, 2022; v1 submitted 5 June, 2020; originally announced June 2020.

Comments: 28 pages, 2 figures

arXiv:2004.07623 [pdf, other]

Recognizing Long Grammatical Sequences Using Recurrent Networks Augmented With An External Differentiable Stack

Authors: Ankur Mali, Alexander Ororbia, Daniel Kifer, Clyde Lee Giles

Abstract: Recurrent neural networks (RNNs) are a widely used deep architecture for sequence modeling, generation, and prediction. Despite success in applications such as machine translation and voice recognition, these stateful models have several critical shortcomings. Specifically, RNNs generalize poorly over very long sequences, which limits their applicability to many important temporal processing and time series forecasting problems. For example, RNNs struggle in recognizing complex context free languages (CFLs), never reaching 100% accuracy on training. One way to address these shortcomings is to couple an RNN with an external, differentiable memory structure, such as a stack. However, differentiable memories in prior work have neither been extensively studied on CFLs nor tested on sequences longer than those seen in training. The few efforts that have studied them have shown that continuous differentiable memory structures yield poor generalization for complex CFLs, making the RNN less interpretable. In this paper, we improve the memory-augmented RNN with important architectural and state updating mechanisms that ensure that the model learns to properly balance the use of its latent states with external memory. Our improved RNN models exhibit better generalization performance and are able to classify long strings generated by complex hierarchical context free grammars (CFGs). We evaluate our models on CFGs, including the Dyck languages, as well as on the Penn Treebank language modelling task, and achieve stable, robust performance across these benchmarks.
Furthermore, we show that only our memory-augmented networks are capable of retaining memory for a longer duration up to strings of length 160.

Submitted 22 April, 2020; v1 submitted 4 April, 2020; originally announced April 2020.

Comments: 14 pages, 10 tables

arXiv:2002.03911 [pdf, other]

Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment

Authors: Alexander Ororbia, Ankur Mali, Daniel Kifer, C. Lee Giles

Abstract: Training deep neural networks on large-scale datasets requires significant hardware resources whose costs (even on cloud platforms) put them out of reach of smaller organizations, groups, and individuals. Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize. Furthermore, it requires researchers to continually develop various tricks, such as specialized weight initializations and activation functions, in order to ensure a stable parameter optimization. Our goal is to seek an effective, neuro-biologically-plausible alternative to backprop that can be used to train deep networks. In this paper, we propose a gradient-free learning procedure, recursive local representation alignment, for training large-scale neural architectures. Experiments with residual networks on CIFAR-10 and the large benchmark, ImageNet, show that our algorithm generalizes as well as backprop while converging sooner due to weight updates that are parallelizable and computationally less demanding. This is empirical evidence that a backprop-free algorithm can scale up to larger datasets.

Submitted 18 September, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

Comments: Further revised submission -- main description of rec-LRA revamped and architecture-agnostic pseudo-code moved to appendix with additional results/derivation updates

arXiv:2001.08479 [pdf, ps, other]

doi 10.1002/mma.6521

Nonlocal Boundary Value Problem for Generalized Hilfer Implicit Fractional Differential Equations

Authors: Ashwini D. Mali, Kishor D. Kucche

Abstract: In this paper, we derive the equivalent fractional integral equation to the nonlinear implicit fractional differential equations involving $\varphi$-Hilfer fractional derivative subject to nonlocal fractional integral boundary conditions. The existence of a solution and the Ulam-Hyers and Ulam-Hyers-Rassias stability have been obtained by means of the equivalent fractional integral equation. Our investigations depend on the fixed point theorem due to Krasnoselskii and the Gronwall inequality involving $\varphi$-Riemann--Liouville fractional integral. An example is provided to illustrate the main results.

Submitted 23 January, 2020; originally announced January 2020.

Comments: 24

Journal ref: Mathematical Methods in the Applied Sciences (2020),1-24

arXiv:1911.08478 [pdf, other]

Sibling Neural Estimators: Improving Iterative Image Decoding with Gradient Communication

Authors: Ankur Mali, Alexander G. Ororbia, Clyde Lee Giles

Abstract: For lossy image compression, we develop a neural-based system which learns a nonlinear estimator for decoding from quantized representations. The system links two recurrent networks that "help" each other reconstruct the same target image patches using complementary portions of spatial context that communicate via gradient signals. This dual agent system builds upon prior work that proposed the iterative refinement algorithm for recurrent neural network (RNN)-based decoding, which improved image reconstruction compared to standard decoding techniques. Our approach, which works with any encoder, neural or non-neural, progressively reduces image patch reconstruction error over a fixed number of steps. Experiments with variants of RNN memory cells, with and without future information, find that our model consistently creates lower distortion images of higher perceptual quality compared to other approaches. Specifically, on the Kodak Lossless True Color Image Suite, we observe as much as a 1.64 decibel (dB) gain over JPEG, a 1.46 dB gain over JPEG 2000, a 1.34 dB gain over the GOOG neural baseline, 0.36 over E2E (a modern competitive neural compression model), and 0.37 over a single iterative neural decoder.

Submitted 19 November, 2019; originally announced November 2019.

Comments: 11 Pages, 2 figures, 1 Table

arXiv:1909.05233 [pdf, other]

The Neural State Pushdown Automata

Authors: Ankur Mali, Alexander Ororbia, C. Lee Giles

Abstract: In order to learn complex grammars, recurrent neural networks (RNNs) require sufficient computational resources to ensure correct grammar recognition. A widely-used approach to expand model capacity would be to couple an RNN to an external memory stack. Here, we introduce a "neural state" pushdown automaton (NSPDA), which consists of a digital stack, instead of an analog one, that is coupled to a neural network state machine. We empirically show its effectiveness in recognizing various context-free grammars (CFGs). First, we develop the underlying mechanics of the proposed higher order recurrent network and its manipulation of a stack as well as how to stably program its underlying pushdown automaton (PDA) to achieve desired finite-state network dynamics. Next, we introduce a noise regularization scheme for higher-order (tensor) networks, to our knowledge the first of its kind, and design an algorithm for improved incremental learning. Finally, we design a method for inserting grammar rules into an NSPDA and empirically show that this prior knowledge improves its training convergence time by an order of magnitude and, in some cases, leads to better generalization. The NSPDA is also compared to a classical analog stack neural network pushdown automaton (NNPDA) as well as a wide array of first and second-order RNNs with and without external memory, trained using different learning algorithms.
Our results show that, for Dyck(2) languages, prior rule-based knowledge is critical for optimization convergence and for ensuring generalization to longer sequences at test time. We observe that many RNNs with and without memory, but no prior knowledge, fail to converge and generalize poorly on CFGs.

Submitted 19 September, 2019; v1 submitted 6 September, 2019; originally announced September 2019.

Comments: 10 pages, 7 tables, 1 figure

arXiv:1907.05849 [pdf]

doi 10.1103/PhysRevLett.124.054502

Direct Measurement of the Elastohydrodynamic Lift Force at the Nanoscale

Authors: Zaicheng Zhang, Vincent Bertin, Muhammad Arshad, Elie Raphael, Thomas Salez, Abdelhamid Maali

Abstract: We present the first direct measurement of the elastohydrodynamic lift force acting on a sphere moving within a viscous liquid, near and along a soft substrate under nanometric confinement. Using atomic force microscopy, the lift force is probed as a function of the gap size, for various driving velocities, viscosities, and stiffnesses. The force increases as the gap is reduced and shows a saturation at small gap. The results are in excellent agreement with scaling arguments and a quantitative model developed from the soft lubrication theory, in linear elasticity, and for small compliances. For larger compliances, or equivalently for smaller confinement length scales, an empirical scaling law for the observed saturation of the lift force is given and discussed.

Submitted 6 November, 2019; v1 submitted 1 July, 2019; originally announced July 2019.

Journal ref: Phys. Rev. Lett. 124, 054502 (2020)

arXiv:1905.10696 [pdf, other]

Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting

Authors: Alexander Ororbia, Ankur Mali, Daniel Kifer, C. Lee Giles

Abstract: In lifelong learning systems based on artificial neural networks, one of the biggest obstacles is the inability to retain old knowledge as new information is encountered. This phenomenon is known as catastrophic forgetting. In this paper, we propose a new kind of connectionist architecture, the Sequential Neural Coding Network, that is robust to forgetting when learning from streams of data points and, unlike networks of today, does not learn via the popular back-propagation of errors. Grounded in the neurocognitive theory of predictive processing, our model adapts synapses in a biologically-plausible fashion while another neural system learns to direct and control this cortex-like structure, mimicking some of the task-executive control functionality of the basal ganglia. In our experiments, we demonstrate that our self-organizing system experiences significantly less forgetting compared to standard neural models, outperforming a swath of previously proposed methods, including rehearsal/data buffer-based methods, on both standard (SplitMNIST, Split Fashion MNIST, etc.) and custom benchmarks even though it is trained in a stream-like fashion. Our work offers evidence that emulating mechanisms in real neuronal systems, e.g., local learning, lateral competition, can yield new directions and possibilities for tackling the grand challenge of lifelong machine learning.

Submitted 14 August, 2022; v1 submitted 25 May, 2019; originally announced May 2019.

Comments: Updated revision, additional baseline results, and expanded appendix (includes derivation from total discrepancy/variational free energy)

arXiv:1810.07411 [pdf, other]

Continual Learning of Recurrent Neural Networks by Locally Aligning Distributed Representations

Authors: Alexander Ororbia, Ankur Mali, C. Lee Giles, Daniel Kifer

Abstract: Temporal models based on recurrent neural networks have proven to be quite powerful in a wide variety of applications. However, training these models often relies on back-propagation through time, which entails unfolding the network over many time steps, making the process of conducting credit assignment considerably more challenging. Furthermore, the nature of back-propagation itself does not permit the use of non-differentiable activation functions and is inherently sequential, making parallelization of the underlying training process difficult. Here, we propose the Parallel Temporal Neural Coding Network (P-TNCN), a biologically inspired model trained by the learning algorithm we call Local Representation Alignment. It aims to resolve the difficulties and problems that plague recurrent networks trained by back-propagation through time. The architecture requires neither unrolling in time nor the derivatives of its internal activation functions. We compare our model and learning procedure to other back-propagation through time alternatives (which also tend to be computationally expensive), including real-time recurrent learning, echo state networks, and unbiased online recurrent optimization. We show that it outperforms these on sequence modeling benchmarks such as Bouncing MNIST, a new benchmark we denote as Bouncing NotMNIST, and Penn Treebank. Notably, our approach can in some instances outperform full back-propagation through time as well as variants such as sparse attentive back-tracking. Significantly, the hidden unit correction phase of P-TNCN allows it to adapt to new datasets even if its synaptic weights are held fixed (zero-shot adaptation) and facilitates retention of prior generative knowledge when faced with a task sequence. We present results that show the P-TNCN's ability to conduct zero-shot adaptation and online continual sequence modeling.

Submitted 10 August, 2019; v1 submitted 17 October, 2018; originally announced October 2018.

Comments: Important revisions made throughout (additional items/results added, including a complexity analysis)

arXiv:1809.03036 [pdf, ps, other]

A Neural Temporal Model for Human Motion Prediction

Authors: Anand Gopalakrishnan, Ankur Mali, Dan Kifer, C. Lee Giles, Alexander G. Ororbia

Abstract: We propose novel neural temporal models for predicting and synthesizing human motion, achieving state-of-the-art in modeling long-term motion trajectories while being competitive with prior work in short-term prediction and requiring significantly less computation. Key aspects of our proposed system include: 1) a novel, two-level processing architecture that aids in generating planned trajectories, 2) a simple set of easily computable features that integrate derivative information, and 3) a novel multi-objective loss function that helps the model to slowly progress from simple next-step prediction to the harder task of multi-step, closed-loop prediction. Our results demonstrate that these innovations improve the modeling of long-term motion trajectories. Finally, we propose a novel metric, called Normalized Power Spectrum Similarity (NPSS), to evaluate the long-term predictive ability of motion synthesis models, complementing the popular mean-squared error (MSE) measure of Euler joint angles over time. We conduct a user study to determine if the proposed NPSS correlates with human evaluation of long-term motion more strongly than MSE and find that it indeed does. We release code and additional results (visualizations) for this paper at: https://github.com/cr7anand/neural_temporal_models

Submitted 22 November, 2019; v1 submitted 9 September, 2018; originally announced September 2018.

Comments: Accepted to CVPR 2019

Journal ref: In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 12116-12125. 2019

arXiv:1808.01608 [pdf, ps, other]

doi 10.1007/s40314-019-0833-5

On the Nonlinear $Ψ$-Hilfer Fractional Differential Equations

Authors: Kishor D. Kucche, Ashwini D. Mali, J. Vanterler da C. Sousa

Abstract: We consider the nonlinear Cauchy problem for $Ψ$-Hilfer fractional differential equations and investigate the existence, interval of existence and uniqueness of solution in the weighted space of functions. The continuous dependence of solutions on initial conditions is proved via the Weissinger fixed point theorem. Picard's successive approximation method has been developed to solve the nonlinear Cauchy problem for differential equations with $Ψ$-Hilfer fractional derivative, and an estimate has been obtained for the error bound. Further, by Picard's successive approximation, we derive the representation formula for the solution of the linear Cauchy problem for the $Ψ$-Hilfer fractional differential equation with constant coefficient and variable coefficient in terms of the Mittag-Leffler function and the Generalized (Kilbas-Saigo) Mittag-Leffler function.

Submitted 28 September, 2018; v1 submitted 5 August, 2018; originally announced August 2018.

Comments: 27 Pages

Journal ref: Comp. Appl. Math. 38, 73 (2019)

arXiv:1805.11703 [pdf, other]

Biologically Motivated Algorithms for Propagating Local Target Representations

Authors: Alexander G. Ororbia, Ankur Mali

Abstract: Finding biologically plausible alternatives to back-propagation of errors is a fundamentally important challenge in artificial neural network research. In this paper, we propose a learning algorithm called error-driven Local Representation Alignment (LRA-E), which has strong connections to predictive coding, a theory that offers a mechanistic way of describing neurocomputational machinery. In addition, we propose an improved variant of Difference Target Propagation, another procedure that comes from the same family of algorithms as LRA-E. We compare our procedures to several other biologically-motivated algorithms, including two feedback alignment algorithms and Equilibrium Propagation. In two benchmarks, we find that both of our proposed algorithms yield stable performance and strong generalization compared to other competing back-propagation alternatives when training deeper, highly nonlinear networks, with LRA-E performing the best overall.

Submitted 15 November, 2018; v1 submitted 26 May, 2018; originally announced May 2018.

Comments: Final version for AAAI (accepted paper)

arXiv:1805.11546 [pdf, other]

Like a Baby: Visually Situated Neural Language Acquisition

Authors: Alexander G. Ororbia, Ankur Mali, Matthew A. Kelly, David Reitter

Abstract: We examine the benefits of visual context in training neural language models to perform next-word prediction. A multi-modal neural architecture is introduced that outperforms its equivalent trained on language alone with a 2% decrease in perplexity, even when no visual context is available at test. Fine-tuning the embeddings of a pre-trained state-of-the-art bidirectional language model (BERT) in the language modeling framework yields a 3.5% improvement. The advantage for training with visual context when testing without is robust across different languages (English, German and Spanish) and different models (GRU, LSTM, $Δ$-RNN, as well as those that use BERT embeddings). Thus, language models perform better when they learn like a baby, i.e., in a multi-modal environment. This finding is compatible with the theory of situated cognition: language is inseparable from its physical context.

Submitted 4 June, 2019; v1 submitted 29 May, 2018; originally announced May 2018.

Comments: Final submission (camera-ready), accepted to ACL 2019

arXiv:1803.05863 [pdf, other]

Learned Neural Iterative Decoding for Lossy Image Compression Systems

Authors: Alexander G. Ororbia, Ankur Mali, Jian Wu, Scott O'Connell, David Miller, C. Lee Giles

Abstract: For lossy image compression systems, we develop an algorithm, iterative refinement, to improve the decoder's reconstruction compared to standard decoding techniques. Specifically, we propose a recurrent neural network approach for nonlinear, iterative decoding. Our decoder, which works with any encoder, employs self-connected memory units that make use of causal and non-causal spatial context information to progressively reduce reconstruction error over a fixed number of steps. We experiment with variants of our estimator and find that iterative refinement consistently creates lower distortion images of higher perceptual quality compared to other approaches. Specifically, on the Kodak Lossless True Color Image Suite, we observe as much as a 0.871 decibel (dB) gain over JPEG, a 1.095 dB gain over JPEG 2000, and a 0.971 dB gain over a competitive neural model.

Submitted 10 November, 2018; v1 submitted 15 March, 2018; originally announced March 2018.

Comments: Vastly updated version, now includes JP2

arXiv:1803.01834 [pdf, other]

Conducting Credit Assignment by Aligning Local Representations

Authors: Alexander G. Ororbia, Ankur Mali, Daniel Kifer, C. Lee Giles

Abstract: Using back-propagation and its variants to train deep networks is often problematic for new users. Issues such as exploding gradients, vanishing gradients, and high sensitivity to weight initialization strategies often make networks difficult to train, especially when users are experimenting with new architectures. Here, we present Local Representation Alignment (LRA), a training procedure that is much less sensitive to bad initializations, does not require modifications to the network architecture, and can be adapted to networks with highly nonlinear and discrete-valued activation functions. Furthermore, we show that one variation of LRA can start with a null initialization of network weights and still successfully train networks with a wide variety of nonlinearities, including tanh, ReLU-6, softplus, signum and others that may draw their inspiration from biology. A comprehensive set of experiments on MNIST and the much harder Fashion MNIST data sets show that LRA can be used to train networks robustly and effectively, succeeding even when back-propagation fails and outperforming other alternative learning algorithms, such as target propagation and feedback alignment.

Submitted 12 July, 2018; v1 submitted 5 March, 2018; originally announced March 2018.

Comments: Full document revision/overhaul, new results/analyses, new diagrams, addition of appendices

arXiv:1703.03565 [pdf]

doi 10.1103/PhysRevLett.118.084501

Visco-elastic drag forces and crossover from no-slip to slip boundary conditions for flow near air-water interfaces

Authors: Abdelhamid Maali, Rodolphe Boisgard, Hamza Chraibi, Zaicheng Zhang, Hamid Kellay, Alois Würger

Abstract: The "free" water surface is generally prone to contamination with surface impurities be they surfactants, particles or other surface active agents. The presence of such impurities can modify flow boundary near such interfaces in a drastic manner. Here we show that vibrating a small sphere mounted on an AFM cantilever near a gas bubble immersed in water, is an excellent probe of surface contamination. Both viscous and elastic forces are exerted by an air-water interface on the vibrating sphere even when very low doses of contaminants are present. The viscous drag forces show a cross-over from no-slip to slip boundary conditions while the elastic forces show a nontrivial variation as the vibration frequency changes. We provide a simple model to rationalize these results and propose a simple way of evaluating the concentration of such surface impurities.

Submitted 10 March, 2017; originally announced March 2017.

Comments: 11 pages, 5 figures

Journal ref: Physical Review Letters 118, 084501 (2017)

arXiv:1503.03283 [pdf, ps, other]

On Acyclic Edge-Coloring of Complete Bipartite Graphs

Authors: Ayineedi Venkateswarlu, Santanu Sarkar, A. Sai Mali

Abstract: An acyclic edge-coloring of a graph is a proper edge-coloring without bichromatic ($2$-colored) cycles. The acyclic chromatic index of a graph $G$, denoted by $a'(G)$, is the least integer $k$ such that $G$ admits an acyclic edge-coloring using $k$ colors. Let $Δ= Δ(G)$ denote the maximum degree of a vertex in a graph $G$. A complete bipartite graph with $n$ vertices on each side is denoted by $K_{n,n}$. Basavaraju, Chandran and Kummini proved that $a'(K_{n,n}) \ge n+2 = Δ+ 2$ when $n$ is odd. Basavaraju and Chandran provided an acyclic edge-coloring of $K_{p,p}$ using $p+2$ colors and thus establishing $a'(K_{p,p}) = p+2 = Δ+ 2$ when $p$ is an odd prime. The main tool in their approach is perfect $1$-factorization of $K_{p,p}$. Recently, following their approach, Venkateswarlu and Sarkar have shown that $K_{2p-1,2p-1}$ admits an acyclic edge-coloring using $2p+1$ colors which implies that $a'(K_{2p-1,2p-1}) = 2p+1 = Δ+ 2$, where $p$ is an odd prime. In this paper, we generalize this approach and present a general framework to possibly get an acyclic edge-coloring of $K_{n,n}$ which possess a perfect $1$-factorization using $n+2 = Δ+2$ colors. In this general framework, we show that $K_{p^2,p^2}$ admits an acyclic edge-coloring using $p^2+2$ colors and thus establishing $a'(K_{p^2,p^2}) = p^2+2 = Δ+ 2$ when $p\ge 5$ is an odd prime.

Submitted 11 March, 2015; originally announced March 2015.

Comments: 17 pages, 10 figures

arXiv:1310.5985 [pdf]

Adaptive Push-Then-Pull Gossip Algorithm for Scale-free Networks

Authors: Ruchir Gupta, Abhijeet C. Maali, Yatindra Nath Singh

Abstract: Real-life networks are generally modelled as scale-free networks. Information diffusion in such networks in a decentralised environment is a difficult and resource-consuming affair. Gossip algorithms have come up as a good solution to this problem. In this paper, we propose the Adaptive First Push Then Pull gossip algorithm. We show that the algorithm works with minimum cost when the transition round to switch from Adaptive Push to Adaptive Pull is close to Round(log(N)). Furthermore, we compare our algorithm with Push, Pull and First Push Then Pull and show that the proposed algorithm is the most cost-efficient in scale-free networks.

Submitted 22 October, 2013; originally announced October 2013.

Showing 1–46 of 46 results for author: Mali, A