-
Direct measurement of the viscocapillary lift force near a liquid interface
Authors:
Hao Zhang,
Zaicheng Zhang,
Aditya Jha,
Yacine Amarouchene,
Thomas Salez,
Thomas Guérin,
Chaouqi Misbah,
Abdelhamid Maali
Abstract:
Lift force of viscous origin is widespread across disciplines, from mechanics to biology. Here, we present the first direct measurement of the lift force acting on a particle moving in a viscous fluid along the liquid interface that separates two liquids. The force arises from the coupling between the viscous flow induced by the particle motion and the capillary deformation of the interface. The measurements show that the lift force increases as the distance between the sphere and the interface decreases, reaching saturation at small distances. The experimental results are in good agreement with the model and numerical calculation developed within the framework of the soft lubrication theory.
Submitted 4 June, 2024;
originally announced June 2024.
-
Investigating Symbolic Capabilities of Large Language Models
Authors:
Neisarg Dave,
Daniel Kifer,
C. Lee Giles,
Ankur Mali
Abstract:
Prompting techniques have significantly enhanced the capabilities of Large Language Models (LLMs) across various complex tasks, including reasoning, planning, and solving math word problems. However, most research has predominantly focused on language-based reasoning and word problems, often overlooking the potential of LLMs in handling symbol-based calculations and reasoning. This study aims to bridge this gap by rigorously evaluating LLMs on a series of symbolic tasks, such as addition, multiplication, modulus arithmetic, numerical precision, and symbolic counting. Our analysis encompasses eight LLMs, including four enterprise-grade and four open-source models, of which three have been pre-trained on mathematical tasks. The assessment framework is anchored in Chomsky's Hierarchy, providing a robust measure of the computational abilities of these models. The evaluation employs minimally explained prompts alongside the zero-shot Chain of Thoughts technique, allowing models to navigate the solution process autonomously. The findings reveal a significant decline in LLMs' performance on context-free and context-sensitive symbolic tasks as the complexity, represented by the number of symbols, increases. Notably, even the fine-tuned GPT3.5 exhibits only marginal improvements, mirroring the performance trends observed in other models. Across the board, all models demonstrated a limited generalization ability on these symbol-intensive tasks. This research underscores LLMs' challenges with increasing symbolic complexity and highlights the need for specialized training, memory and architectural adjustments to enhance their proficiency in symbol-based reasoning tasks.
Submitted 21 May, 2024;
originally announced May 2024.
-
A Review of Neuroscience-Inspired Machine Learning
Authors:
Alexander Ororbia,
Ankur Mali,
Adam Kohan,
Beren Millidge,
Tommaso Salvatori
Abstract:
One major criticism of deep learning centers around the biological implausibility of the credit assignment schema used for learning -- backpropagation of errors. This implausibility translates into practical limitations, spanning scientific fields, including incompatibility with hardware and non-differentiable implementations, thus leading to expensive energy requirements. In contrast, biologically plausible credit assignment is compatible with practically any learning condition and is energy-efficient. As a result, it accommodates hardware and scientific modeling, e.g. learning with physical systems and non-differentiable behavior. Furthermore, it can lead to the development of real-time, adaptive neuromorphic processing systems. In addressing this problem, an interdisciplinary branch of artificial intelligence research that lies at the intersection of neuroscience, cognitive science, and machine learning has emerged. In this paper, we survey several vital algorithms that model bio-plausible rules of credit assignment in artificial neural networks, discussing the solutions they provide for different scientific fields as well as their advantages on CPUs, GPUs, and novel implementations of neuromorphic hardware. We conclude by discussing the future challenges that will need to be addressed in order to make such algorithms more useful in practical applications.
Submitted 16 February, 2024;
originally announced March 2024.
-
Neuro-mimetic Task-free Unsupervised Online Learning with Continual Self-Organizing Maps
Authors:
Hitesh Vaidya,
Travis Desell,
Ankur Mali,
Alexander Ororbia
Abstract:
An intelligent system capable of continual learning is one that can process and extract knowledge from potentially infinitely long streams of pattern vectors. The major challenge that makes crafting such a system difficult is known as catastrophic forgetting: an agent, such as one based on artificial neural networks (ANNs), struggles to retain previously acquired knowledge when learning from new samples. Furthermore, ensuring that knowledge is preserved for previous tasks becomes more challenging when input is not supplemented with task boundary information. Although forgetting in the context of ANNs has been studied extensively, far less work has investigated it in unsupervised architectures such as the venerable self-organizing map (SOM), a neural model often used in clustering and dimensionality reduction. While the internal mechanisms of SOMs could, in principle, yield sparse representations that improve memory retention, we observe that, when a fixed-size SOM processes continuous data streams, it experiences concept drift. In light of this, we propose a generalization of the SOM, the continual SOM (CSOM), which is capable of online unsupervised learning under a low memory budget. Our results on benchmarks including MNIST, Kuzushiji-MNIST, and Fashion-MNIST show an almost twofold increase in accuracy, and on CIFAR-10 demonstrate a state-of-the-art result when tested in the (online) unsupervised class-incremental learning setting.
Submitted 19 February, 2024;
originally announced February 2024.
-
Stable and Robust Deep Learning By Hyperbolic Tangent Exponential Linear Unit (TeLU)
Authors:
Alfredo Fernandez,
Ankur Mali
Abstract:
In this paper, we introduce the Hyperbolic Tangent Exponential Linear Unit (TeLU), a novel neural network activation function, represented as $f(x) = x \cdot \tanh(e^x)$. TeLU is designed to overcome the limitations of conventional activation functions like ReLU, GELU, and Mish by addressing the vanishing and, to an extent, the exploding gradient problems. Our theoretical analysis and empirical assessments reveal that TeLU outperforms existing activation functions in stability and robustness, effectively adjusting activation outputs' mean towards zero for enhanced training stability and convergence. Extensive evaluations against popular activation functions (ReLU, GELU, SiLU, Mish, Logish, Smish) across advanced architectures, including ResNet-50, demonstrate TeLU's lower variance and superior performance, even under hyperparameter conditions optimized for other functions. In large-scale tests with challenging datasets like CIFAR-10, CIFAR-100, and TinyImageNet, encompassing 860 scenarios, TeLU consistently showcased its effectiveness, positioning itself as a potential new standard for neural network activation functions, boosting stability and performance in diverse deep learning applications.
Submitted 5 February, 2024;
originally announced February 2024.
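The TeLU formula quoted in the abstract above, f(x) = x * tanh(e^x), can be evaluated directly. The following is a minimal sketch using only the Python standard library, not the authors' implementation. The function names `telu` and `telu_grad`, the overflow cutoff of 20, and the derivative (obtained here by the product and chain rules) are illustrative assumptions that do not come from the listing.

```python
import math


def telu(x: float) -> float:
    """TeLU as written in the abstract: f(x) = x * tanh(exp(x))."""
    # Guard (an assumption of this sketch, not from the paper): math.exp
    # overflows for very large x, and tanh(exp(x)) already rounds to 1.0
    # in double precision well below the cutoff, so f(x) = x there.
    if x > 20.0:
        return x
    return x * math.tanh(math.exp(x))


def telu_grad(x: float) -> float:
    """Derivative by the product and chain rules:
    f'(x) = tanh(e^x) + x * e^x * (1 - tanh(e^x)^2)."""
    if x > 20.0:
        return 1.0
    e = math.exp(x)
    t = math.tanh(e)
    return t + x * e * (1.0 - t * t)


if __name__ == "__main__":
    # Print a few sample values of the activation and its slope.
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"x={x:+.1f}  telu={telu(x):+.6f}  grad={telu_grad(x):+.6f}")
```

For large positive inputs the function behaves like the identity, and for large negative inputs it decays smoothly towards zero from below, which is the qualitative shape the formula implies.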
-
Stability Analysis of Various Symbolic Rule Extraction Methods from Recurrent Neural Network
Authors:
Neisarg Dave,
Daniel Kifer,
C. Lee Giles,
Ankur Mali
Abstract:
This paper analyzes two competing rule extraction methodologies: quantization and equivalence query. We trained $3600$ RNN models, extracting $18000$ DFA with a quantization approach (k-means and SOM) and $3600$ DFA by equivalence query ($L^{*}$) methods across $10$ initialization seeds. We sampled the datasets from $7$ Tomita and $4$ Dyck grammars and trained them on $4$ RNN cells: LSTM, GRU, O2RNN, and MIRNN. The observations from our experiments establish the superior performance of O2RNN and quantization-based rule extraction over others. $L^{*}$, primarily proposed for regular grammars, performs similarly to quantization methods for Tomita languages when neural networks are perfectly trained. However, for partially trained RNNs, $L^{*}$ shows instability in the number of states in DFA, e.g., for Tomita 5 and Tomita 6 languages, $L^{*}$ produced more than $100$ states. In contrast, quantization methods result in rules with a number of states very close to that of the ground truth DFA. Among RNN cells, O2RNN produces stable DFA consistently compared to other cells. For Dyck languages, we observe that although GRU outperforms other RNNs in network performance, the DFA extracted by O2RNN has higher performance and better stability. The stability is computed as the standard deviation of accuracy on test sets on networks trained across $10$ seeds. On Dyck languages, quantization methods outperformed $L^{*}$ with better stability in accuracy and the number of states. $L^{*}$ often showed instability in accuracy in the order of $16\% - 22\%$ for GRU and MIRNN while deviation for quantization methods varied in $5\% - 15\%$. In many instances with LSTM and GRU, DFAs extracted by $L^{*}$ even failed to beat chance accuracy ($50\%$), while those extracted by quantization methods had standard deviation in the $7\%-17\%$ range. For O2RNN, both rule extraction methods had deviation in the $0.5\% - 3\%$ range.
Submitted 4 February, 2024;
originally announced February 2024.
-
On the Computational Complexity and Formal Hierarchy of Second Order Recurrent Neural Networks
Authors:
Ankur Mali,
Alexander Ororbia,
Daniel Kifer,
Lee Giles
Abstract:
Artificial neural networks (ANNs) with recurrence and self-attention have been shown to be Turing-complete (TC). However, existing work has shown that these ANNs require multiple turns or unbounded computation time, even with unbounded precision in weights, in order to recognize TC grammars. Moreover, under constraints such as fixed or bounded precision neurons and time, ANNs without memory are shown to struggle to recognize even context-free languages. In this work, we extend the theoretical foundation for the $2^{nd}$-order recurrent network ($2^{nd}$ RNN) and prove there exists a class of $2^{nd}$ RNNs that is Turing-complete with bounded time. This model is capable of directly encoding a transition table into its recurrent weights, enabling bounded time computation, and is interpretable by design. We also demonstrate that $2^{nd}$-order RNNs, without memory, under bounded weights and time constraints, outperform modern-day models such as vanilla RNNs and gated recurrent units in recognizing regular grammars. We provide an upper bound and a stability analysis on the maximum number of neurons required by $2^{nd}$-order RNNs to recognize any class of regular grammar. Extensive experiments on the Tomita grammars support our findings, demonstrating the importance of tensor connections in crafting computationally efficient RNNs. Finally, we show $2^{nd}$-order RNNs are also interpretable by extraction and can extract state machines with higher success rates as compared to first-order RNNs. Our results extend the theoretical foundations of RNNs and offer promising avenues for future explainable AI research.
Submitted 26 September, 2023;
originally announced September 2023.
-
On the Tensor Representation and Algebraic Homomorphism of the Neural State Turing Machine
Authors:
Ankur Mali,
Alexander Ororbia,
Daniel Kifer,
Lee Giles
Abstract:
Recurrent neural networks (RNNs) and transformers have been shown to be Turing-complete, but this result assumes infinite precision in their hidden representations, positional encodings for transformers, and unbounded computation time in general. In practical applications, however, it is crucial to have real-time models that can recognize Turing-complete grammars in a single pass. To address this issue and to better understand the true computational power of artificial neural networks (ANNs), we introduce a new class of recurrent models called the neural state Turing machine (NSTM). The NSTM has bounded weights and finite-precision connections and can simulate any Turing Machine (TM) in real-time. In contrast to prior work that assumes unbounded time and precision in weights to demonstrate equivalence with TMs, we prove that a $13$-neuron bounded tensor RNN, coupled with third-order synapses, can model any TM class in real-time. Furthermore, under the Markov assumption, we provide a new theoretical bound for a non-recurrent network augmented with memory, showing that a tensor feedforward network with $25$th-order finite precision weights is equivalent to a universal TM.
Submitted 26 September, 2023;
originally announced September 2023.
-
Brain-Inspired Computational Intelligence via Predictive Coding
Authors:
Tommaso Salvatori,
Ankur Mali,
Christopher L. Buckley,
Thomas Lukasiewicz,
Rajesh P. N. Rao,
Karl Friston,
Alexander Ororbia
Abstract:
Artificial intelligence (AI) is rapidly becoming one of the key technologies of this century. The majority of results in AI thus far have been achieved using deep neural networks trained with the error backpropagation learning algorithm. However, the ubiquitous adoption of this approach has highlighted some important limitations such as substantial computational cost, difficulty in quantifying uncertainty, lack of robustness, unreliability, and biological implausibility. It is possible that addressing these limitations may require schemes that are inspired and guided by neuroscience theories. One such theory, called predictive coding (PC), has shown promising performance in machine intelligence tasks, exhibiting exciting properties that make it potentially valuable for the machine learning community: PC can model information processing in different brain areas, can be used in cognitive control and robotics, and has a solid mathematical grounding in variational inference, offering a powerful inversion scheme for a specific class of continuous-state generative models. With the hope of foregrounding research in this direction, we survey the literature that has contributed to this perspective, highlighting the many ways that PC might play a role in the future of machine learning and computational intelligence at large.
Submitted 15 August, 2023;
originally announced August 2023.
-
Unsteady drag force on an immersed sphere oscillating near a wall
Authors:
Zaicheng Zhang,
Vincent Bertin,
Martin Essink,
Hao Zhang,
Nicolas Fares,
Zaiyi Shen,
Thomas Bickel,
Thomas Salez,
Abdelhamid Maali
Abstract:
The unsteady hydrodynamic drag exerted on an oscillating sphere near a planar wall is addressed experimentally, theoretically, and numerically. The experiments are performed by using colloidal-probe Atomic Force Microscopy (AFM) in thermal noise mode. The natural resonance frequencies and quality factors are extracted from the measurement of the power spectrum density of the probe oscillation for a broad range of gap distances and Womersley numbers. The shift in the natural resonance frequency of the colloidal probe as the probe approaches a solid wall reveals the wall-induced variations of the effective mass of the probe. Interestingly, a crossover from a positive to a negative shift is observed as the Womersley number increases. In order to rationalize the results, the confined unsteady Stokes equation is solved numerically using a finite-element method, as well as through asymptotic calculations. The in-phase and out-of-phase terms of the hydrodynamic drag acting on the sphere are obtained and agree well with the experimental results. Altogether, the experimental, theoretical, and numerical results show that the hydrodynamic force felt by an immersed sphere oscillating near a wall is highly dependent on the Womersley number.
Submitted 12 July, 2023;
originally announced July 2023.
-
The Predictive Forward-Forward Algorithm
Authors:
Alexander Ororbia,
Ankur Mali
Abstract:
We propose the predictive forward-forward (PFF) algorithm for conducting credit assignment in neural systems. Specifically, we design a novel, dynamic recurrent neural system that learns a directed generative circuit jointly and simultaneously with a representation circuit. Notably, the system integrates learnable lateral competition, noise injection, and elements of predictive coding, an emerging and viable neurobiological process theory of cortical function, with the forward-forward (FF) adaptation scheme. Furthermore, PFF efficiently learns to propagate learning signals and updates synapses with forward passes only, eliminating key structural and computational constraints imposed by backpropagation-based schemes. Besides computational advantages, the PFF process could prove useful for understanding the learning mechanisms behind biological neurons that use local signals despite missing feedback connections. We run experiments on image data and demonstrate that the PFF procedure works as well as backpropagation, offering a promising brain-inspired algorithm for classifying, reconstructing, and synthesizing data patterns.
Submitted 2 April, 2023; v1 submitted 4 January, 2023;
originally announced January 2023.
-
Convolutional Neural Generative Coding: Scaling Predictive Coding to Natural Images
Authors:
Alexander Ororbia,
Ankur Mali
Abstract:
In this work, we develop convolutional neural generative coding (Conv-NGC), a generalization of predictive coding to the case of convolution/deconvolution-based computation. Specifically, we concretely implement a flexible neurobiologically-motivated algorithm that progressively refines latent state feature maps in order to dynamically form a more accurate internal representation/reconstruction model of natural images. The performance of the resulting sensory processing system is evaluated on complex datasets such as Color-MNIST, CIFAR-10, and Street View House Numbers (SVHN). We study the effectiveness of our brain-inspired model on the tasks of reconstruction and image denoising and find that it is competitive with convolutional auto-encoding systems trained by backpropagation of errors and outperforms them with respect to out-of-distribution reconstruction (including the full 90k CINIC-10 test set).
Submitted 5 February, 2023; v1 submitted 22 November, 2022;
originally announced November 2022.
-
Like a bilingual baby: The advantage of visually grounding a bilingual language model
Authors:
Khai-Nguyen Nguyen,
Zixin Tang,
Ankur Mali,
Alex Kelly
Abstract:
Unlike most neural language models, humans learn language in a rich, multi-sensory and, often, multi-lingual environment. Current language models typically fail to fully capture the complexities of multilingual language use. We train an LSTM language model on images and captions in English and Spanish from MS-COCO-ES. We find that the visual grounding improves the model's understanding of semantic similarity both within and across languages and improves perplexity. However, we find no significant advantage of visual grounding for abstract words. Our results provide additional evidence of the advantages of visually grounded language models and point to the need for more naturalistic language data from multilingual speakers and multilingual datasets with perceptual grounding.
Submitted 13 February, 2023; v1 submitted 11 October, 2022;
originally announced October 2022.
-
Active Predicting Coding: Brain-Inspired Reinforcement Learning for Sparse Reward Robotic Control Problems
Authors:
Alexander Ororbia,
Ankur Mali
Abstract:
In this article, we propose a backpropagation-free approach to robotic control through the neuro-cognitive computational framework of neural generative coding (NGC), designing an agent built completely from powerful predictive coding/processing circuits that facilitate dynamic, online learning from sparse rewards, embodying the principles of planning-as-inference. Concretely, we craft an adaptive agent system, which we call active predictive coding (ActPC), that balances an internally-generated epistemic signal (meant to encourage intelligent exploration) with an internally-generated instrumental signal (meant to encourage goal-seeking behavior) to ultimately learn how to control various simulated robotic systems as well as a complex robotic arm using a realistic robotics simulator, i.e., the Surreal Robotics Suite, for the block lifting task and can pick-and-place problems. Notably, our experimental results demonstrate that our proposed ActPC agent performs well in the face of sparse (extrinsic) reward signals and is competitive with or outperforms several powerful backprop-based RL approaches.
Submitted 19 September, 2022;
originally announced September 2022.
-
A Robust Backpropagation-Free Framework for Images
Authors:
Timothy Zee,
Alexander G. Ororbia,
Ankur Mali,
Ifeoma Nwogu
Abstract:
While current deep learning algorithms have been successful for a wide variety of artificial intelligence (AI) tasks, including those involving structured image data, they present deep neurophysiological conceptual issues due to their reliance on the gradients that are computed by backpropagation of errors (backprop). Gradients are required to obtain synaptic weight adjustments but require knowledge of feed-forward activities in order to conduct backward propagation, a biologically implausible process. This is known as the "weight transport problem". Therefore, in this work, we present a more biologically plausible approach towards solving the weight transport problem for image data. This approach, which we name the error kernel driven activation alignment (EKDAA) algorithm, accomplishes this through the introduction of locally derived error transmission kernels and error maps. Like standard deep learning networks, EKDAA performs the standard forward process via weights and activation functions; however, its backward error computation involves adaptive error kernels that propagate local error signals through the network. The efficacy of EKDAA is demonstrated by performing visual-recognition tasks on the Fashion MNIST, CIFAR-10 and SVHN benchmarks, along with demonstrating its ability to extract visual features from natural color images. Furthermore, in order to demonstrate its non-reliance on gradient computations, results are presented for an EKDAA-trained CNN that employs a non-differentiable activation function.
Submitted 5 November, 2023; v1 submitted 3 June, 2022;
originally announced June 2022.
-
Contactless Rheology of Soft Gels over a Broad Frequency Range
Authors:
Zaicheng Zhang,
Muhammad Arshad,
Vincent Bertin,
Samir Almohamad,
Elie Raphaël,
Thomas Salez,
Abdelhamid Maali
Abstract:
We report contactless measurements of the viscoelastic rheological properties of soft gels. The experiments are performed using a colloidal-probe Atomic Force Microscope (AFM) in a liquid environment and in dynamic mode. The mechanical response is measured as a function of the liquid gap thickness for different oscillation frequencies. Our measurements reveal an elastohydrodynamic (EHD) coupling between the flow induced by the probe oscillation and the viscoelastic deformation of the gels. The data are quantitatively described by a viscoelastic lubrication model. The frequency-dependent storage and loss moduli of the polydimethylsiloxane (PDMS) gels are extracted from fits of the data to the model and are in good agreement with the Chasset--Thirion law. Our results demonstrate that contactless colloidal-probe methods are powerful tools that can be used for probing soft interfaces finely over a wide range of frequencies.
Submitted 9 February, 2022;
originally announced February 2022.
-
Neural JPEG: End-to-End Image Compression Leveraging a Standard JPEG Encoder-Decoder
Authors:
Ankur Mali,
Alexander Ororbia,
Daniel Kifer,
Lee Giles
Abstract:
Recent advances in deep learning have led to superhuman performance across a variety of applications. Recently, these methods have been successfully employed to improve the rate-distortion performance in the task of image compression. However, current methods either use additional post-processing blocks on the decoder end to improve compression or propose an end-to-end compression scheme based on heuristics. For the majority of these, the trained deep neural networks (DNNs) are not compatible with standard encoders and would be difficult to deploy on personal computers and cellphones. In light of this, we propose a system that learns to improve the encoding performance by enhancing its internal neural representations on both the encoder and decoder ends, an approach we call Neural JPEG. We propose frequency domain pre-editing and post-editing methods to optimize the distribution of the DCT coefficients at both encoder and decoder ends in order to improve the standard compression (JPEG) method. Moreover, we design and integrate a scheme for jointly learning quantization tables within this hybrid neural compression framework. Experiments demonstrate that our approach successfully improves the rate-distortion performance over JPEG across various quality metrics, such as PSNR and MS-SSIM, and generates visually appealing images with better color retention quality.
Submitted 31 January, 2022; v1 submitted 27 January, 2022;
originally announced January 2022.
-
An Empirical Analysis of Recurrent Learning Algorithms In Neural Lossy Image Compression Systems
Authors:
Ankur Mali,
Alexander Ororbia,
Daniel Kifer,
Lee Giles
Abstract:
Recent advances in deep learning have resulted in image compression algorithms that outperform JPEG and JPEG 2000 on the standard Kodak benchmark. However, they are slow to train (due to backprop-through-time) and, to the best of our knowledge, have not been systematically evaluated on a large variety of datasets. In this paper, we perform the first large-scale comparison of recent state-of-the-art hybrid neural compression algorithms, while exploring the effects of alternative training strategies (when applicable). The hybrid recurrent neural decoder is a former state-of-the-art model (recently overtaken by a Google model) that can be trained using backprop-through-time (BPTT) or with alternative algorithms like sparse attentive backtracking (SAB), unbiased online recurrent optimization (UORO), and real-time recurrent learning (RTRL). We compare these training alternatives along with the Google models (GOOG and E2E) on 6 benchmark datasets. Surprisingly, we found that the model trained with SAB performs better (outperforming even BPTT), resulting in faster convergence and a better peak signal-to-noise ratio.
Submitted 27 January, 2022;
originally announced January 2022.
-
Electroviscous drag on squeezing motion in sphere-plane geometry
Authors:
Marcela Rodriguez Matus,
Zaicheng Zhang,
Zouhir Benrahla,
Arghya Majee,
Abdelhamid Maali,
Alois Würger
Abstract:
Theoretically and experimentally, we study electroviscous phenomena resulting from charge-flow coupling in a nanoscale capillary. Our theoretical approach relies on Poisson-Boltzmann mean-field theory and on coupled linear relations for charge and hydrodynamic flows, including electro-osmosis and charge advection. With respect to the unperturbed Poiseuille flow, we define an electroviscous coupling parameter $ξ$, which turns out to be maximum where the film thickness $h_0$ is comparable to the screening length $λ$. We also present dynamic AFM data for the visco-elastic response of a confined water film in sphere-plane geometry; our theory provides a quantitative description for the electroviscous drag coefficient and the electrostatic repulsion as a function of the film thickness, with the surface charge density as the only free parameter. Charge regulation sets in at even smaller distances.
Submitted 28 May, 2022; v1 submitted 4 January, 2022;
originally announced January 2022.
-
On tempered fractional calculus with respect to functions and the associated fractional differential equations
Authors:
Ashwini D. Mali,
Kishor D. Kucche,
Arran Fernandez,
Hafiz Muhammad Fahad
Abstract:
The prime aim of the present paper is to continue developing the theory of tempered fractional integrals and derivatives of a function with respect to another function. This theory combines the tempered fractional calculus with the $Ψ$-fractional calculus, both of which have found applications in topics including continuous time random walks. After studying the basic theory of the $Ψ$-tempered operators, we prove mean value theorems and Taylor's theorems for both Riemann--Liouville type and Caputo type cases of these operators. Furthermore, we study some nonlinear fractional differential equations involving $Ψ$-tempered derivatives, proving existence-uniqueness theorems by using the Banach contraction principle, and proving stability results by using Grönwall type inequalities.
Submitted 18 February, 2022; v1 submitted 13 August, 2021;
originally announced August 2021.
-
Backprop-Free Reinforcement Learning with Active Neural Generative Coding
Authors:
Alexander Ororbia,
Ankur Mali
Abstract:
In humans, perceptual awareness facilitates the fast recognition and extraction of information from sensory input. This awareness largely depends on how the human agent interacts with the environment. In this work, we propose active neural generative coding, a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments. Specifically, we develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference. We demonstrate on several simple control problems that our framework performs competitively with deep Q-learning. The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.
Submitted 19 December, 2021; v1 submitted 10 July, 2021;
originally announced July 2021.
-
OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas
Authors:
Shivansh Rao,
Vikas Kumar,
Daniel Kifer,
Lee Giles,
Ankur Mali
Abstract:
Given a single RGB panorama, the goal of 3D layout reconstruction is to estimate the room layout by predicting the corners, floor boundary, and ceiling boundary. A common approach has been to use standard convolutional networks to predict the corners and boundaries, followed by post-processing to generate the 3D layout. However, the space-varying distortions in panoramic images are not compatible with the translational equivariance property of standard convolutions, thus degrading performance. Instead, we propose to use spherical convolutions. The resulting network, which we call OmniLayout, performs convolutions directly on the sphere surface, sampling according to inverse equirectangular projection, and is hence invariant to equirectangular distortions. Using a new evaluation metric, we show that our network reduces the error in the heavily distorted regions (near the poles) by approximately 25% when compared to standard convolutional networks. Experimental results show that OmniLayout outperforms the state-of-the-art by approximately 4% on two different benchmark datasets (PanoContext and Stanford 2D-3D). Code is available at https://github.com/rshivansh/OmniLayout.
Submitted 19 April, 2021;
originally announced April 2021.
-
Recognizing and Verifying Mathematical Equations using Multiplicative Differential Neural Units
Authors:
Ankur Mali,
Alexander Ororbia,
Daniel Kifer,
C. Lee Giles
Abstract:
Automated mathematical reasoning is a challenging problem that requires an agent to learn algebraic patterns that contain long-range dependencies. Two particular tasks that test this type of reasoning are (1) mathematical equation verification, which requires determining whether trigonometric and linear algebraic statements are valid identities or not, and (2) equation completion, which entails filling in a blank within an expression to make it true. Solving these tasks with deep learning requires that the neural model learn how to manipulate and compose various algebraic symbols, carrying this ability over to previously unseen expressions. Artificial neural networks, including recurrent networks and transformers, struggle to generalize on these kinds of difficult compositional problems, often exhibiting poor extrapolation performance. In contrast, recursive neural networks (recursive-NNs) are, theoretically, capable of achieving better extrapolation due to their tree-like design but are difficult to optimize as the depth of their underlying tree structure increases. To overcome this issue, we extend recursive-NNs to utilize multiplicative, higher-order synaptic connections and, furthermore, to learn to dynamically control and manipulate an external memory. We argue that this key modification gives the neural system the ability to capture powerful transition functions for each possible input. We demonstrate the effectiveness of our proposed higher-order, memory-augmented recursive-NN models on two challenging mathematical equation tasks, showing improved extrapolation, stable performance, and faster convergence. Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion.
Submitted 6 April, 2021;
originally announced April 2021.
-
On Coupled System of Nonlinear $Ψ$-Hilfer Hybrid Fractional Differential Equations
Authors:
Ashwini D. Mali,
Kishor D. Kucche,
J. Vanterler da C. Sousa
Abstract:
This paper is dedicated to investigating the existence of solutions to the initial value problem (IVP) for a coupled system of $Ψ$-Hilfer hybrid fractional differential equations (FDEs) and the boundary value problem (BVP) for a coupled system of $Ψ$-Hilfer hybrid FDEs. The analysis depends on two fixed point theorems involving three operators defined on a Banach algebra. As an application, we provide concrete examples to demonstrate the effectiveness of our results.
Submitted 5 December, 2020;
originally announced December 2020.
-
Contactless rheology of finite-size air-water interfaces
Authors:
Vincent Bertin,
Zaicheng Zhang,
Rodolphe Boisgard,
Christine Grauby-Heywang,
Elie Raphael,
Thomas Salez,
Abdelhamid Maali
Abstract:
We present contactless atomic-force microscopy measurements of the hydrodynamic interactions between a rigid sphere and an air bubble in water at the micro-scale. The size of the bubble is found to have a significant effect on the response due to the long-range capillary deformation of the air-water interface. To rationalize the experimental data, we develop a viscocapillary lubrication model accounting for the finite-size effect. The comparison between experiments and theory allows us to measure the air-water surface tension, without contact, paving the way towards robust contactless tensiometry of polluted air-water interfaces.
Submitted 2 March, 2021; v1 submitted 30 September, 2020;
originally announced September 2020.
-
On the Boundary Value Problems of Ψ-Hilfer Fractional Differential Equations
Authors:
Ashwini D. Mali,
Kishor D. Kucche
Abstract:
In the current paper, we derive comparison results for the homogeneous and non-homogeneous linear initial value problem (IVP) for $Ψ$-Hilfer fractional differential equations. In the presence of upper and lower solutions, the obtained comparison results and the location of roots theorem are utilized to prove the existence and uniqueness of the solution for the linear $Ψ$-Hilfer boundary value problem (BVP) through the linear non-homogeneous $Ψ$-Hilfer IVP. Assuming the existence of a lower solution $w_0$ and an upper solution $z_0$, we establish the existence of minimal and maximal solutions for the nonlinear $Ψ$-Hilfer BVP in the line segment $[w_0,\,z_0]$ of the weighted space $C_{1-\,γ;\, Ψ}\left( J,\,\mathbb{R}\right)$. Further, it is demonstrated that the iterative Picard type sequences that begin with the lower and upper solutions converge to the minimal and maximal solutions, respectively, and that those started at any point of the line segment converge to the exact solution of the nonlinear $Ψ$-Hilfer BVP. Finally, an example is provided in support of the main results.
Submitted 19 September, 2020;
originally announced September 2020.
-
Near-field probe of thermal capillary fluctuations of a hemispherical bubble
Authors:
Zaicheng Zhang,
Yuliang Wang,
Yacine Amarouchene,
Rodolphe Boisgard,
Hamid Kellay,
Alois Würger,
Abdelhamid Maali
Abstract:
We report measurements of resonant thermal capillary oscillations of a hemispherical liquid-gas interface obtained using a half bubble deposited on a solid substrate. The thermal motion of the hemispherical interface is investigated using an atomic force microscope cantilever that probes the amplitude of vibrations of this interface versus frequency. The spectrum of such nanoscale thermal oscillations of the bubble surface presents several resonance peaks and reveals that the contact line of the hemispherical bubble is pinned on the substrate. The analysis of these peaks allows us to measure the surface viscosity of the bubble interface. Minute amounts of impurities are responsible for altering the rheology of the pure water surface.
Submitted 8 September, 2020;
originally announced September 2020.
-
On the Nonlinear $Ψ$-Hilfer Hybrid Fractional Differential Equations
Authors:
Kishor D. Kucche,
Ashwini D. Mali
Abstract:
In this paper, we first derive the fractional integral equation equivalent to $Ψ$-Hilfer hybrid fractional differential equations and, through it, prove the existence of a solution in the weighted space. The primary objective of the paper is to obtain estimates on the $Ψ$-Hilfer derivative and utilize them to derive hybrid fractional differential inequalities involving the $Ψ$-Hilfer derivative. With the assistance of these fractional differential inequalities, we establish the existence of extremal solutions, comparison theorems, and uniqueness of the solution.
Submitted 14 August, 2020;
originally announced August 2020.
-
A provably stable neural network Turing Machine
Authors:
John Stogin,
Ankur Mali,
C Lee Giles
Abstract:
We introduce a neural stack architecture, including a differentiable parametrized stack operator that approximates stack push and pop operations for suitable choices of parameters that explicitly represents a stack. We prove the stability of this stack architecture: after arbitrarily many stack operations, the state of the neural stack still closely resembles the state of the discrete stack. Using the neural stack with a recurrent neural network, we introduce a neural network Pushdown Automaton (nnPDA) and prove that nnPDA with finite/bounded neurons and time can simulate any PDA. Furthermore, we extend our construction and propose a new architecture, the neural state Turing Machine (nnTM). We prove that a differentiable nnTM with bounded neurons can simulate a Turing Machine (TM) in real-time. Just like the neural stack, these architectures are also stable. Finally, we extend our construction to show that the differentiable nnTM is equivalent to a Universal Turing Machine (UTM) and can simulate any TM with only seven finite/bounded precision neurons. This work provides a new theoretical bound for the computational capability of bounded precision RNNs augmented with memory.
Submitted 18 September, 2022; v1 submitted 5 June, 2020;
originally announced June 2020.
-
Recognizing Long Grammatical Sequences Using Recurrent Networks Augmented With An External Differentiable Stack
Authors:
Ankur Mali,
Alexander Ororbia,
Daniel Kifer,
Clyde Lee Giles
Abstract:
Recurrent neural networks (RNNs) are a widely used deep architecture for sequence modeling, generation, and prediction. Despite success in applications such as machine translation and voice recognition, these stateful models have several critical shortcomings. Specifically, RNNs generalize poorly over very long sequences, which limits their applicability to many important temporal processing and time series forecasting problems. For example, RNNs struggle to recognize complex context free languages (CFLs), never reaching 100% accuracy on training. One way to address these shortcomings is to couple an RNN with an external, differentiable memory structure, such as a stack. However, differentiable memories in prior work have neither been extensively studied on CFLs nor tested on sequences longer than those seen in training. The few efforts that have studied them have shown that continuous differentiable memory structures yield poor generalization for complex CFLs, making the RNN less interpretable. In this paper, we improve the memory-augmented RNN with important architectural and state updating mechanisms that ensure that the model learns to properly balance the use of its latent states with external memory. Our improved RNN models exhibit better generalization performance and are able to classify long strings generated by complex hierarchical context free grammars (CFGs). We evaluate our models on CFGs, including the Dyck languages, as well as on the Penn Treebank language modelling task, and achieve stable, robust performance across these benchmarks. Furthermore, we show that only our memory-augmented networks are capable of retaining memory for a longer duration, up to strings of length 160.
Submitted 22 April, 2020; v1 submitted 4 April, 2020;
originally announced April 2020.
-
Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment
Authors:
Alexander Ororbia,
Ankur Mali,
Daniel Kifer,
C. Lee Giles
Abstract:
Training deep neural networks on large-scale datasets requires significant hardware resources whose costs (even on cloud platforms) put them out of reach of smaller organizations, groups, and individuals. Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize. Furthermore, it requires researchers to continually develop various tricks, such as specialized weight initializations and activation functions, in order to ensure a stable parameter optimization. Our goal is to seek an effective, neuro-biologically-plausible alternative to backprop that can be used to train deep networks. In this paper, we propose a gradient-free learning procedure, recursive local representation alignment, for training large-scale neural architectures. Experiments with residual networks on CIFAR-10 and the large benchmark, ImageNet, show that our algorithm generalizes as well as backprop while converging sooner due to weight updates that are parallelizable and computationally less demanding. This is empirical evidence that a backprop-free algorithm can scale up to larger datasets.
Submitted 18 September, 2020; v1 submitted 10 February, 2020;
originally announced February 2020.
-
Nonlocal Boundary Value Problem for Generalized Hilfer Implicit Fractional Differential Equations
Authors:
Ashwini D. Mali,
Kishor D. Kucche
Abstract:
In this paper, we derive the fractional integral equation equivalent to the nonlinear implicit fractional differential equations involving the $\varphi$-Hilfer fractional derivative subject to nonlocal fractional integral boundary conditions. The existence of a solution and its Ulam-Hyers and Ulam-Hyers-Rassias stability are obtained by means of the equivalent fractional integral equation. Our investigations depend on the fixed point theorem due to Krasnoselskii and the Gronwall inequality involving the $\varphi$-Riemann--Liouville fractional integral. An example is provided to illustrate the main results.
Submitted 23 January, 2020;
originally announced January 2020.
-
Sibling Neural Estimators: Improving Iterative Image Decoding with Gradient Communication
Authors:
Ankur Mali,
Alexander G. Ororbia,
Clyde Lee Giles
Abstract:
For lossy image compression, we develop a neural-based system which learns a nonlinear estimator for decoding from quantized representations. The system links two recurrent networks that "help" each other reconstruct the same target image patches using complementary portions of spatial context that communicate via gradient signals. This dual agent system builds upon prior work that proposed the iterative refinement algorithm for recurrent neural network (RNN)-based decoding, which improved image reconstruction compared to standard decoding techniques. Our approach, which works with any encoder, neural or non-neural, progressively reduces image patch reconstruction error over a fixed number of steps. Experiments with variants of RNN memory cells, with and without future information, find that our model consistently creates lower distortion images of higher perceptual quality compared to other approaches. Specifically, on the Kodak Lossless True Color Image Suite, we observe as much as a 1.64 decibel (dB) gain over JPEG, a 1.46 dB gain over JPEG 2000, a 1.34 dB gain over the GOOG neural baseline, 0.36 dB over E2E (a modern competitive neural compression model), and 0.37 dB over a single iterative neural decoder.
Submitted 19 November, 2019;
originally announced November 2019.
-
The Neural State Pushdown Automata
Authors:
Ankur Mali,
Alexander Ororbia,
C. Lee Giles
Abstract:
In order to learn complex grammars, recurrent neural networks (RNNs) require sufficient computational resources to ensure correct grammar recognition. A widely-used approach to expand model capacity would be to couple an RNN to an external memory stack. Here, we introduce a "neural state" pushdown automaton (NSPDA), which consists of a digital stack, instead of an analog one, that is coupled to a neural network state machine. We empirically show its effectiveness in recognizing various context-free grammars (CFGs). First, we develop the underlying mechanics of the proposed higher order recurrent network and its manipulation of a stack as well as how to stably program its underlying pushdown automaton (PDA) to achieve desired finite-state network dynamics. Next, we introduce a noise regularization scheme for higher-order (tensor) networks, to our knowledge the first of its kind, and design an algorithm for improved incremental learning. Finally, we design a method for inserting grammar rules into a NSPDA and empirically show that this prior knowledge improves its training convergence time by an order of magnitude and, in some cases, leads to better generalization. The NSPDA is also compared to a classical analog stack neural network pushdown automaton (NNPDA) as well as a wide array of first and second-order RNNs with and without external memory, trained using different learning algorithms. Our results show that, for Dyck(2) languages, prior rule-based knowledge is critical for optimization convergence and for ensuring generalization to longer sequences at test time. We observe that many RNNs with and without memory, but no prior knowledge, fail to converge and generalize poorly on CFGs.
Submitted 19 September, 2019; v1 submitted 6 September, 2019;
originally announced September 2019.
-
Direct Measurement of the Elastohydrodynamic Lift Force at the Nanoscale
Authors:
Zaicheng Zhang,
Vincent Bertin,
Muhammad Arshad,
Elie Raphael,
Thomas Salez,
Abdelhamid Maali
Abstract:
We present the first direct measurement of the elastohydrodynamic lift force acting on a sphere moving within a viscous liquid, near and along a soft substrate under nanometric confinement. Using atomic force microscopy, the lift force is probed as a function of the gap size, for various driving velocities, viscosities, and stiffnesses. The force increases as the gap is reduced and shows a saturation at small gap. The results are in excellent agreement with scaling arguments and a quantitative model developed from the soft lubrication theory, in linear elasticity, and for small compliances. For larger compliances, or equivalently for smaller confinement length scales, an empirical scaling law for the observed saturation of the lift force is given and discussed.
Submitted 6 November, 2019; v1 submitted 1 July, 2019;
originally announced July 2019.
-
Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting
Authors:
Alexander Ororbia,
Ankur Mali,
Daniel Kifer,
C. Lee Giles
Abstract:
In lifelong learning systems based on artificial neural networks, one of the biggest obstacles is the inability to retain old knowledge as new information is encountered. This phenomenon is known as catastrophic forgetting. In this paper, we propose a new kind of connectionist architecture, the Sequential Neural Coding Network, that is robust to forgetting when learning from streams of data points and, unlike networks of today, does not learn via the popular back-propagation of errors. Grounded in the neurocognitive theory of predictive processing, our model adapts synapses in a biologically-plausible fashion while another neural system learns to direct and control this cortex-like structure, mimicking some of the task-executive control functionality of the basal ganglia. In our experiments, we demonstrate that our self-organizing system experiences significantly less forgetting compared to standard neural models, outperforming a swath of previously proposed methods, including rehearsal/data buffer-based methods, on both standard (SplitMNIST, Split Fashion MNIST, etc.) and custom benchmarks even though it is trained in a stream-like fashion. Our work offers evidence that emulating mechanisms in real neuronal systems, e.g., local learning, lateral competition, can yield new directions and possibilities for tackling the grand challenge of lifelong machine learning.
Submitted 14 August, 2022; v1 submitted 25 May, 2019;
originally announced May 2019.
-
Continual Learning of Recurrent Neural Networks by Locally Aligning Distributed Representations
Authors:
Alexander Ororbia,
Ankur Mali,
C. Lee Giles,
Daniel Kifer
Abstract:
Temporal models based on recurrent neural networks have proven to be quite powerful in a wide variety of applications. However, training these models often relies on back-propagation through time, which entails unfolding the network over many time steps, making the process of conducting credit assignment considerably more challenging. Furthermore, the nature of back-propagation itself does not permit the use of non-differentiable activation functions and is inherently sequential, making parallelization of the underlying training process difficult. Here, we propose the Parallel Temporal Neural Coding Network (P-TNCN), a biologically inspired model trained by the learning algorithm we call Local Representation Alignment. It aims to resolve the difficulties and problems that plague recurrent networks trained by back-propagation through time. The architecture requires neither unrolling in time nor the derivatives of its internal activation functions. We compare our model and learning procedure to other back-propagation through time alternatives (which also tend to be computationally expensive), including real-time recurrent learning, echo state networks, and unbiased online recurrent optimization. We show that it outperforms these on sequence modeling benchmarks such as Bouncing MNIST, a new benchmark we denote as Bouncing NotMNIST, and Penn Treebank. Notably, our approach can in some instances outperform full back-propagation through time as well as variants such as sparse attentive back-tracking. Significantly, the hidden unit correction phase of P-TNCN allows it to adapt to new datasets even if its synaptic weights are held fixed (zero-shot adaptation) and facilitates retention of prior generative knowledge when faced with a task sequence. We present results that show the P-TNCN's ability to conduct zero-shot adaptation and online continual sequence modeling.
Submitted 10 August, 2019; v1 submitted 17 October, 2018;
originally announced October 2018.
-
A Neural Temporal Model for Human Motion Prediction
Authors:
Anand Gopalakrishnan,
Ankur Mali,
Dan Kifer,
C. Lee Giles,
Alexander G. Ororbia
Abstract:
We propose novel neural temporal models for predicting and synthesizing human motion, achieving state-of-the-art performance in modeling long-term motion trajectories while being competitive with prior work in short-term prediction and requiring significantly less computation. Key aspects of our proposed system include: 1) a novel, two-level processing architecture that aids in generating planned trajectories, 2) a simple set of easily computable features that integrate derivative information, and 3) a novel multi-objective loss function that helps the model to slowly progress from simple next-step prediction to the harder task of multi-step, closed-loop prediction. Our results demonstrate that these innovations improve the modeling of long-term motion trajectories. Finally, we propose a novel metric, called Normalized Power Spectrum Similarity (NPSS), to evaluate the long-term predictive ability of motion synthesis models, complementing the popular mean-squared error (MSE) measure of Euler joint angles over time. We conduct a user study to determine if the proposed NPSS correlates with human evaluation of long-term motion more strongly than MSE and find that it indeed does. We release code and additional results (visualizations) for this paper at: https://github.com/cr7anand/neural_temporal_models
Submitted 22 November, 2019; v1 submitted 9 September, 2018;
originally announced September 2018.
-
On the Nonlinear $Ψ$-Hilfer Fractional Differential Equations
Authors:
Kishor D. Kucche,
Ashwini D. Mali,
J. Vanterler da C. Sousa
Abstract:
We consider the nonlinear Cauchy problem for $Ψ$-Hilfer fractional differential equations and investigate the existence, interval of existence and uniqueness of the solution in the weighted space of functions. The continuous dependence of solutions on initial conditions is proved via the Weissinger fixed point theorem. Picard's successive approximation method has been developed to solve the nonlinear Cauchy problem for differential equations with the $Ψ$-Hilfer fractional derivative, and an estimate has been obtained for the error bound. Further, by Picard's successive approximation, we derive the representation formula for the solution of the linear Cauchy problem for the $Ψ$-Hilfer fractional differential equation with constant coefficient and variable coefficient in terms of the Mittag-Leffler function and the Generalized (Kilbas-Saigo) Mittag-Leffler function.
Submitted 28 September, 2018; v1 submitted 5 August, 2018;
originally announced August 2018.
-
Biologically Motivated Algorithms for Propagating Local Target Representations
Authors:
Alexander G. Ororbia,
Ankur Mali
Abstract:
Finding biologically plausible alternatives to back-propagation of errors is a fundamentally important challenge in artificial neural network research. In this paper, we propose a learning algorithm called error-driven Local Representation Alignment (LRA-E), which has strong connections to predictive coding, a theory that offers a mechanistic way of describing neurocomputational machinery. In addition, we propose an improved variant of Difference Target Propagation, another procedure that comes from the same family of algorithms as LRA-E. We compare our procedures to several other biologically-motivated algorithms, including two feedback alignment algorithms and Equilibrium Propagation. In two benchmarks, we find that both of our proposed algorithms yield stable performance and strong generalization compared to other competing back-propagation alternatives when training deeper, highly nonlinear networks, with LRA-E performing the best overall.
Submitted 15 November, 2018; v1 submitted 26 May, 2018;
originally announced May 2018.
-
Like a Baby: Visually Situated Neural Language Acquisition
Authors:
Alexander G. Ororbia,
Ankur Mali,
Matthew A. Kelly,
David Reitter
Abstract:
We examine the benefits of visual context in training neural language models to perform next-word prediction. A multi-modal neural architecture is introduced that outperforms its equivalent trained on language alone with a 2% decrease in perplexity, even when no visual context is available at test time. Fine-tuning the embeddings of a pre-trained state-of-the-art bidirectional language model (BERT) in the language modeling framework yields a 3.5% improvement. The advantage of training with visual context when testing without it is robust across different languages (English, German and Spanish) and different models (GRU, LSTM, $Δ$-RNN, as well as those that use BERT embeddings). Thus, language models perform better when they learn like a baby, i.e., in a multi-modal environment. This finding is compatible with the theory of situated cognition: language is inseparable from its physical context.
Submitted 4 June, 2019; v1 submitted 29 May, 2018;
originally announced May 2018.
-
Learned Neural Iterative Decoding for Lossy Image Compression Systems
Authors:
Alexander G. Ororbia,
Ankur Mali,
Jian Wu,
Scott O'Connell,
David Miller,
C. Lee Giles
Abstract:
For lossy image compression systems, we develop an algorithm, iterative refinement, to improve the decoder's reconstruction compared to standard decoding techniques. Specifically, we propose a recurrent neural network approach for nonlinear, iterative decoding. Our decoder, which works with any encoder, employs self-connected memory units that make use of causal and non-causal spatial context information to progressively reduce reconstruction error over a fixed number of steps. We experiment with variants of our estimator and find that iterative refinement consistently creates lower distortion images of higher perceptual quality compared to other approaches. Specifically, on the Kodak Lossless True Color Image Suite, we observe as much as a 0.871 decibel (dB) gain over JPEG, a 1.095 dB gain over JPEG 2000, and a 0.971 dB gain over a competitive neural model.
Submitted 10 November, 2018; v1 submitted 15 March, 2018;
originally announced March 2018.
-
Conducting Credit Assignment by Aligning Local Representations
Authors:
Alexander G. Ororbia,
Ankur Mali,
Daniel Kifer,
C. Lee Giles
Abstract:
Using back-propagation and its variants to train deep networks is often problematic for new users. Issues such as exploding gradients, vanishing gradients, and high sensitivity to weight initialization strategies often make networks difficult to train, especially when users are experimenting with new architectures. Here, we present Local Representation Alignment (LRA), a training procedure that is much less sensitive to bad initializations, does not require modifications to the network architecture, and can be adapted to networks with highly nonlinear and discrete-valued activation functions. Furthermore, we show that one variation of LRA can start with a null initialization of network weights and still successfully train networks with a wide variety of nonlinearities, including tanh, ReLU-6, softplus, signum and others that may draw their inspiration from biology.
A comprehensive set of experiments on MNIST and the much harder Fashion MNIST data sets shows that LRA can be used to train networks robustly and effectively, succeeding even when back-propagation fails and outperforming alternative learning algorithms such as target propagation and feedback alignment.
Submitted 12 July, 2018; v1 submitted 5 March, 2018;
originally announced March 2018.
-
Visco-elastic drag forces and crossover from no-slip to slip boundary conditions for flow near air-water interfaces
Authors:
Abdelhamid Maali,
Rodolphe Boisgard,
Hamza Chraibi,
Zaicheng Zhang,
Hamid Kellay,
Alois Würger
Abstract:
The "free" water surface is generally prone to contamination with surface impurities, be they surfactants, particles or other surface-active agents. The presence of such impurities can drastically modify the flow boundary conditions near such interfaces. Here we show that vibrating a small sphere mounted on an AFM cantilever near a gas bubble immersed in water is an excellent probe of surface contamination. Both viscous and elastic forces are exerted by an air-water interface on the vibrating sphere even when very low doses of contaminants are present. The viscous drag forces show a crossover from no-slip to slip boundary conditions, while the elastic forces show a nontrivial variation as the vibration frequency changes. We provide a simple model to rationalize these results and propose a simple way of evaluating the concentration of such surface impurities.
Submitted 10 March, 2017;
originally announced March 2017.
-
On Acyclic Edge-Coloring of Complete Bipartite Graphs
Authors:
Ayineedi Venkateswarlu,
Santanu Sarkar,
A. Sai Mali
Abstract:
An acyclic edge-coloring of a graph is a proper edge-coloring without bichromatic ($2$-colored) cycles. The acyclic chromatic index of a graph $G$, denoted by $a'(G)$, is the least integer $k$ such that $G$ admits an acyclic edge-coloring using $k$ colors. Let $Δ = Δ(G)$ denote the maximum degree of a vertex in a graph $G$. A complete bipartite graph with $n$ vertices on each side is denoted by $K_{n,n}$. Basavaraju, Chandran and Kummini proved that $a'(K_{n,n}) \ge n+2 = Δ + 2$ when $n$ is odd. Basavaraju and Chandran provided an acyclic edge-coloring of $K_{p,p}$ using $p+2$ colors, thus establishing $a'(K_{p,p}) = p+2 = Δ + 2$ when $p$ is an odd prime. The main tool in their approach is a perfect $1$-factorization of $K_{p,p}$. Recently, following their approach, Venkateswarlu and Sarkar have shown that $K_{2p-1,2p-1}$ admits an acyclic edge-coloring using $2p+1$ colors, which implies that $a'(K_{2p-1,2p-1}) = 2p+1 = Δ + 2$, where $p$ is an odd prime. In this paper, we generalize this approach and present a general framework to possibly obtain an acyclic edge-coloring of any $K_{n,n}$ that possesses a perfect $1$-factorization using $n+2 = Δ + 2$ colors. In this general framework, we show that $K_{p^2,p^2}$ admits an acyclic edge-coloring using $p^2+2$ colors, thus establishing $a'(K_{p^2,p^2}) = p^2+2 = Δ + 2$ when $p \ge 5$ is an odd prime.
Submitted 11 March, 2015;
originally announced March 2015.
-
Adaptive Push-Then-Pull Gossip Algorithm for Scale-free Networks
Authors:
Ruchir Gupta,
Abhijeet C. Maali,
Yatindra Nath Singh
Abstract:
Real-life networks are generally modelled as scale-free networks. Information diffusion in such networks in a decentralised environment is a difficult and resource-consuming affair. Gossip algorithms have emerged as a good solution to this problem. In this paper, we propose the Adaptive First Push Then Pull gossip algorithm. We show that the algorithm works with minimum cost when the transition round for switching from Adaptive Push to Adaptive Pull is close to Round(log(N)). Furthermore, we compare our algorithm with Push, Pull and First Push Then Pull and show that the proposed algorithm is the most cost-efficient in scale-free networks.
Submitted 22 October, 2013;
originally announced October 2013.