
Minimizing Schrödinger eigenvalues for confining potentials

Abstract: We consider the problem of minimizing the lowest eigenvalue of the Schrödinger operator $-Δ+V$ in $L^2(\mathbb R^d)$ when the integral $\int e^{-tV}\,dx$ is given for some $t>0$. We show that the eigenvalue is minimal for the harmonic oscillator and derive a quantitative version of the corresponding inequality.

Submitted 21 July, 2024; originally announced July 2024.

Comments: 8 pages

arXiv:2407.11808 [pdf, ps, other]

Riesz means asymptotics for Dirichlet and Neumann Laplacians on Lipschitz domains

Authors: Rupert L. Frank, Simon Larson

Abstract: We consider the eigenvalues of the Dirichlet and Neumann Laplacians on a bounded domain with Lipschitz boundary and prove two-term asymptotics for their Riesz means of arbitrary positive order. Moreover, when the underlying domain is convex, we obtain universal, non-asymptotic bounds that correctly reproduce the two leading terms in the asymptotics and depend on the domain only through simple geometric characteristics. An important ingredient in the proof of the latter result is a pointwise bound for the heat kernel of the Neumann Laplacian in a convex domain with universal constants. Additional ingredients in our proof are non-asymptotic versions of various Tauberian theorems.

Submitted 16 July, 2024; originally announced July 2024.

arXiv:2406.18501 [pdf, other]

Is In-Context Learning a Type of Gradient-Based Learning? Evidence from the Inverse Frequency Effect in Structural Priming

Authors: Zhenghao Zhou, Robert Frank, R. Thomas McCoy

Abstract: Large language models (LLMs) have shown the emergent capability of in-context learning (ICL). One line of research has explained ICL as functionally performing gradient descent. In this paper, we introduce a new way of diagnosing whether ICL is functionally equivalent to gradient-based learning. Our approach is based on the inverse frequency effect (IFE) -- a phenomenon in which an error-driven learner is expected to show larger updates when trained on infrequent examples than frequent ones. The IFE has previously been studied in psycholinguistics because humans show this effect in the context of structural priming (the tendency for people to produce sentence structures they have encountered recently); the IFE has been used as evidence that human structural priming must involve error-driven learning mechanisms. In our experiments, we simulated structural priming within ICL and found that LLMs display the IFE, with the effect being stronger in larger models. We conclude that ICL is indeed a type of gradient-based learning, supporting the hypothesis that a gradient component is implicitly computed in the forward pass during ICL. Our results suggest that both humans and LLMs make use of gradient-based, error-driven processing mechanisms.

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2405.10652 [pdf, ps, other]

Endpoint Schatten class properties of commutators

Authors: Rupert L. Frank, Fedor Sukochev, Dmitriy Zanin

Abstract: We study trace ideal properties of the commutators $[(-Δ)^{\fracε{2}},M_f]$ of a power of the Laplacian with the multiplication operator by a function $f$ on $\mathbb R^d$. For a certain range of $ε\in\mathbb R$, we show that this commutator belongs to the weak Schatten class $\mathcal L_{\frac d{1-ε},\infty}$ if and only if the distributional gradient of $f$ belongs to $L_{\frac d{1-ε}}$. Moreover, in this case we determine the asymptotics of the singular values. Our proofs use, among other things, the tool of Double Operator Integrals.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 41 pages

arXiv:2405.03572 [pdf, other]

RoboCar: A Rapidly Deployable Open-Source Platform for Autonomous Driving Research

Authors: Mehdi Testouri, Gamal Elghazaly, Raphael Frank

Abstract: This paper introduces RoboCar, an open-source research platform for autonomous driving developed at the University of Luxembourg. RoboCar provides a modular, cost-effective framework for the development of experimental Autonomous Driving Systems (ADS), utilizing the 2018 KIA Soul EV. The platform integrates a robust hardware and software architecture that aligns with the vehicle's existing systems, minimizing the need for extensive modifications. It supports various autonomous driving functions and has undergone real-world testing on public roads in Luxembourg City. This paper outlines the platform's architecture, integration challenges, and initial test results, offering insights into its application in advancing autonomous driving research. RoboCar is available to anyone at https://github.com/sntubix/robocar and is released under an open-source MIT license.

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2404.13163 [pdf, other]

A national longitudinal dataset of skills taught in U.S. higher education curricula

Authors: Alireza Javadian Sabet, Sarah H. Bana, Renzhe Yu, Morgan R. Frank

Abstract: Higher education plays a critical role in driving an innovative economy by equipping students with knowledge and skills demanded by the workforce. While researchers and practitioners have developed data systems to track detailed occupational skills, such as those established by the U.S. Department of Labor (DOL), much less effort has been made to document skill development in higher education at a similar granularity. Here, we fill this gap by presenting a longitudinal dataset of skills inferred from over three million course syllabi taught at nearly three thousand U.S. higher education institutions. To construct this dataset, we apply natural language processing to extract from course descriptions detailed workplace activities (DWAs) used by the DOL to describe occupations. We then aggregate these DWAs to create skill profiles for institutions and academic majors. Our dataset offers a large-scale representation of college-educated workers and their role in the economy. To showcase the utility of this dataset, we use it to 1) compare the similarity of skills taught and skills in the workforce according to the US Bureau of Labor Statistics, 2) estimate gender differences in acquired skills based on enrollment data, 3) depict temporal trends in the skills taught in social science curricula, and 4) connect college majors' skill distinctiveness to salary differences of graduates. Overall, this dataset can enable new research on the source of skills in the context of workforce development and provide actionable insights for shaping the future of higher education to meet evolving labor demands especially in the face of new technologies.

Submitted 19 April, 2024; originally announced April 2024.

Comments: 44 pages, 21 figures, 10 tables

arXiv:2404.09926 [pdf, ps, other]

Lieb-Thirring inequality for the 2D Pauli operator

Authors: Rupert L. Frank, Hynek Kovařík

Abstract: By the Aharonov-Casher theorem, the Pauli operator $P$ has no zero eigenvalue when the normalized magnetic flux $α$ satisfies $|α|<1$, but it does have a zero energy resonance. We prove that in this case a Lieb-Thirring inequality for the $γ$-th moment of the eigenvalues of $P+V$ is valid under the optimal restrictions $γ\geq |α|$ and $γ>0$. Besides the usual semiclassical integral, the right side of our inequality involves an integral where the zero energy resonance state appears explicitly. Our inequality improves earlier works that were restricted to moments of order $γ\geq 1$.

Submitted 15 April, 2024; originally announced April 2024.

Comments: 31 pages

arXiv:2403.12640 [pdf, ps, other]

Hardy inequalities for large fermionic systems

Authors: Rupert L. Frank, Thomas Hoffmann-Ostenhof, Ari Laptev, Jan Philip Solovej

Abstract: Given $0<s<\frac d2$ with $s\leq 1$, we are interested in the large $N$-behavior of the optimal constant $κ_N$ in the Hardy inequality $\sum_{n=1}^N (-Δ_n)^s \geq κ_N \sum_{n<m} |X_n-X_m|^{-2s}$, when restricted to antisymmetric functions. We show that $N^{1-\frac{2s}d}κ_N$ has a positive, finite limit given by a certain variational problem, thereby generalizing a result of Lieb and Yau related to the Chandrasekhar theory of gravitational collapse.

Submitted 19 March, 2024; originally announced March 2024.

Comments: 27 pages; dedicated to Brian Davies, in admiration, on the occasion of his 80th birthday

arXiv:2403.06301 [pdf, other]

LIEDER: Linguistically-Informed Evaluation for Discourse Entity Recognition

Authors: Xiaomeng Zhu, Robert Frank

Abstract: Discourse Entity (DE) recognition is the task of identifying novel and known entities introduced within a text. While previous work has found that large language models have basic, if imperfect, DE recognition abilities (Schuster and Linzen, 2022), it remains largely unassessed which of the fundamental semantic properties that govern the introduction and subsequent reference to DEs they have knowledge of. We propose the Linguistically-Informed Evaluation for Discourse Entity Recognition (LIEDER) dataset that allows for a detailed examination of language models' knowledge of four crucial semantic properties: existence, uniqueness, plurality, and novelty. We find evidence that state-of-the-art large language models exhibit sensitivity to all of these properties except novelty, which demonstrates that they have yet to reach human-level language understanding abilities.

Submitted 10 March, 2024; originally announced March 2024.

arXiv:2403.00357 [pdf, ps, other]

On the sharp constants in the regional fractional Sobolev inequalities

Authors: Rupert L. Frank, Tianling Jin, Wei Wang

Abstract: In this paper, we study the sharp constants in fractional Sobolev inequalities associated with the regional fractional Laplacian in domains.

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2402.13953 [pdf, ps, other]

On Courant and Pleijel theorems for sub-Riemannian Laplacians

Authors: Rupert L. Frank, Bernard Helffer

Abstract: We are interested in the number of nodal domains of eigenfunctions of sub-Laplacians on sub-Riemannian manifolds. Specifically, we investigate the validity of Pleijel's theorem, which states that, as soon as the dimension is strictly larger than 1, the number of nodal domains of an eigenfunction corresponding to the k-th eigenvalue is strictly (and uniformly, in a certain sense) smaller than k for large k. In the first part of this paper we reduce this question from the case of general sub-Riemannian manifolds to that of nilpotent groups. In the second part, we analyze in detail the case where the nilpotent group is a Heisenberg group times a Euclidean space. Along the way we improve known bounds on the optimal constants in the Faber-Krahn and isoperimetric inequalities on these groups.

Submitted 29 April, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

Comments: 77 pages; v2 contains an improvement of the results in Theorem 7.2; v3 contains an improvement of the result in Theorem 4.1

arXiv:2402.08527 [pdf, ps, other]

A short review on Improvements and stability for some interpolation inequalities

Authors: Jean Dolbeault, Maria J. Esteban, Alessio Figalli, Rupert Frank, Michael Loss

Abstract: In this paper, we present recent stability results with explicit and dimensionally sharp constants and optimal norms for the Sobolev inequality and for the Gaussian logarithmic Sobolev inequality obtained by the authors in [24]. The stability for the Gaussian logarithmic Sobolev inequality was obtained as a byproduct of the stability for the Sobolev inequality. Here we give a new, direct, alternative proof. We also discuss improved versions of interpolation inequalities based on the carré du champ method.

Submitted 22 April, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

MSC Class: 49J40 (Primary) 26D10; 35A23; 58E35 (Secondary)

arXiv:2401.18051 [pdf, other]

The Nature of the Action Potential

Authors: Vitaly L. Galinsky, Lawrence R. Frank

Abstract: We demonstrate that our recently developed theory of electric field wave propagation in anisotropic and inhomogeneous brain tissues, which has been shown to explain a broad range of observed coherent synchronous brain electrical processes, also explains the spiking behavior of single neurons, thus bridging the gap between the fundamental element of brain electrical activity (the neuron) and large-scale coherent synchronous electrical activity. Our analysis indicates that the membrane interface of the axonal cellular system can be mathematically described by a nonlinear system with several small parameters. This allows for the rigorous derivation of an accurate yet simpler nonlinear model following the formal small parameter expansion. The resulting action potential model exhibits a smooth, continuous transition from the linear wave oscillatory regime to the nonlinear spiking regime, as well as a critical transition to a non-oscillatory regime. These transitions occur with changes in the criticality parameter and include several different bifurcation types, representative of the various experimentally detected neuron types. This new theory overcomes the limitations of the Hodgkin-Huxley model, such as the inability to explain extracellular spiking, efficient brain synchronization, saltatory conduction along myelinated axons, and a variety of other observed coherent macroscopic brain electrical phenomena. We also show that the standard cable axon theory can be recovered by our approach, using the very crude assumptions of piece-wise homogeneity and isotropy. However, the diffusion process described by the cable equation is not capable of supporting action potential propagation across a wide range of experimentally reported axon parameters.

Submitted 25 March, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

arXiv:2312.16482 [pdf, ps, other]

Cwikel-Lieb-Rozenblum type inequalities for Hardy-Schrödinger operator

Authors: Giao Ky Duong, Rupert L. Frank, Thi Minh Thao Le, Phan Thành Nam, Phuoc-Tai Nguyen

Abstract: We prove a Cwikel-Lieb-Rozenblum type inequality for the number of negative eigenvalues of the Hardy-Schrödinger operator $-Δ- (d-2)^2/(4|x|^2) -W(x)$ on $L^2(\mathbb{R}^d)$. The bound is given in terms of a weighted $L^{d/2}-$norm of $W$ which is sharp in both large and small coupling regimes. We also obtain a similar bound for the fractional Laplacian.

Submitted 19 June, 2024; v1 submitted 27 December, 2023; originally announced December 2023.

Comments: 15 pages, final version to appear in J. Math. Pures Appl

arXiv:2312.11029 [pdf, other]

Picsou: Enabling Efficient Cross-Consensus Communication

Authors: Reginald Frank, Micah Murray, Suyash Gupta, Ethan Xu, Natacha Crooks, Manos Kapritsos

Abstract: Replicated state machines (RSMs) cannot effectively communicate today as there is no formal framework or efficient protocol to do so. To address this issue, we introduce a new primitive, the Cross-Cluster Consistent Broadcast (C3B) and present PICSOU, a practical C3B implementation. PICSOU draws inspiration from networking and TCP to allow two RSMs to communicate with constant metadata overhead in the failure-free case and minimal number of message resends in the case of failures. PICSOU is flexible and allows both crash fault-tolerant and byzantine fault-tolerant protocols to communicate. At the heart of PICSOU's good performance and generality lies a novel technique we call QUACKs (quorum acknowledgements) that allow nodes in each RSM to precisely determine when messages have definitely been received, or definitely been lost. Our results are promising: we obtain up to 24x better performance than existing all-to-all solutions.

Submitted 18 December, 2023; originally announced December 2023.

arXiv:2311.04900 [pdf, other]

How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure

Authors: Michael Wilson, Jackson Petty, Robert Frank

Abstract: Language models are typically evaluated on their success at predicting the distribution of specific words in specific contexts. Yet linguistic knowledge also encodes relationships between contexts, allowing inferences between word distributions. We investigate the degree to which pre-trained Transformer-based large language models (LLMs) represent such relationships, focusing on the domain of argument structure. We find that LLMs perform well in generalizing the distribution of a novel noun argument between related contexts that were seen during pre-training (e.g., the active object and passive subject of the verb spray), succeeding by making use of the semantically-organized structure of the embedding space for word embeddings. However, LLMs fail at generalizations between related contexts that have not been observed during pre-training, but which instantiate more abstract, but well-attested structural generalizations (e.g., between the active object and passive subject of an arbitrary verb). Instead, in this case, LLMs show a bias to generalize based on linear order. This finding points to a limitation with current models and points to a reason for which their training is data-intensive.s reported here are available at https://github.com/clay-lab/structural-alternations.

Submitted 8 November, 2023; originally announced November 2023.

Comments: Accepted to TACL; Presented at EMNLP 2023

arXiv:2311.03595 [pdf, other]

Brief for the Canada House of Commons Study on the Implications of Artificial Intelligence Technologies for the Canadian Labor Force: Generative Artificial Intelligence Shatters Models of AI and Labor

Authors: Morgan R. Frank

Abstract: Exciting advances in generative artificial intelligence (AI) have sparked concern for jobs, education, productivity, and the future of work. As with past technologies, generative AI may not lead to mass unemployment. But, unlike past technologies, generative AI is creative, cognitive, and potentially ubiquitous which makes the usual assumptions of automation predictions ill-suited for today. Existing projections suggest that generative AI will impact workers in occupations that were previously considered immune to automation. As AI's full set of capabilities and applications emerge, policy makers should promote workers' career adaptability. This goal requires improved data on job separations and unemployment by locality and job titles in order to identify early-indicators for the workers facing labor disruption. Further, prudent policy should incentivize education programs to accommodate learning with AI as a tool while preparing students for the demands of the future of work.

Submitted 6 November, 2023; originally announced November 2023.

arXiv:2309.08014 [pdf, ps, other]

A div-curl inequality for orthonormal functions

Authors: Rupert L. Frank

Abstract: We prove a bound on the sum of the product of curl-free and divergence-free vector fields. Under appropriate orthonormality conditions our bound scales sublinearly in the number of terms, similar in spirit to Lieb--Thirring inequalities.

Submitted 14 September, 2023; originally announced September 2023.

Comments: 11 pages; dedicated to Fritz Gesztesy on the occasion of his 70th birthday

arXiv:2308.07917 [pdf, other]

Degenerate Stability of the Caffarelli-Kohn-Nirenberg Inequality along the Felli-Schneider Curve

Authors: Rupert L. Frank, Jonas W. Peteranderl

Abstract: We show that the Caffarelli-Kohn-Nirenberg (CKN) inequality holds with a remainder term that is quartic in the distance to the set of optimizers for the full parameter range of the Felli-Schneider (FS) curve. The fourth power is best possible. This is due to the presence of non-trivial zero modes of the Hessian of the deficit functional along the FS-curve. Following an iterated Bianchi-Egnell strategy, the heart of our proof is verifying a `secondary non-degeneracy condition'. Our result completes the stability analysis for the CKN-inequality to leading order started by Wei and Wu. Moreover, it is the first instance of degenerate stability for non-constant optimizers and for a non-compact domain.

Submitted 29 February, 2024; v1 submitted 15 August, 2023; originally announced August 2023.

Comments: 33 pages; final version

Journal ref: Calculus of Variations and PDE 63, 44 (2024)

arXiv:2308.05234 [pdf, other]

Leveraging the Edge and Cloud for V2X-Based Real-Time Object Detection in Autonomous Driving

Authors: Faisal Hawlader, François Robinet, Raphaël Frank

Abstract: Environmental perception is a key element of autonomous driving because the information received from the perception module influences core driving decisions. An outstanding challenge in real-time perception for autonomous driving lies in finding the best trade-off between detection quality and latency. Major constraints on both computation and power have to be taken into account for real-time perception in autonomous vehicles. Larger object detection models tend to produce the best results, but are also slower at runtime. Since the most accurate detectors cannot run in real-time locally, we investigate the possibility of offloading computation to edge and cloud platforms, which are less resource-constrained. We create a synthetic dataset to train object detection models and evaluate different offloading strategies. Using real hardware and network simulations, we compare different trade-offs between prediction quality and end-to-end delay. Since sending raw frames over the network implies additional transmission delays, we also explore the use of JPEG and H.265 compression at varying qualities and measure their impact on prediction metrics. We show that models with adequate compression can be run in real-time on the cloud while outperforming local detection performance.

Submitted 9 August, 2023; originally announced August 2023.

arXiv:2308.01654 [pdf, other]

Towards a Safe Real-Time Motion Planning Framework for Autonomous Driving Systems: An MPPI Approach

Authors: Mehdi Testouri, Gamal Elghazaly, Raphael Frank

Abstract: Planning safe trajectories in Autonomous Driving Systems (ADS) is a complex problem to solve in real-time. The main challenge to solve this problem arises from the various conditions and constraints imposed by road geometry, semantics and traffic rules, as well as the presence of dynamic agents. Recently, Model Predictive Path Integral (MPPI) has shown to be an effective framework for optimal motion planning and control in robot navigation in unstructured and highly uncertain environments. In this paper, we formulate the motion planning problem in ADS as a nonlinear stochastic dynamic optimization problem that can be solved using an MPPI strategy. The main technical contribution of this work is a method to handle obstacles within the MPPI formulation safely. In this method, obstacles are approximated by circles that can be easily integrated into the MPPI cost formulation while considering safety margins. The proposed MPPI framework has been efficiently implemented in our autonomous vehicle and experimentally validated using three different primitive scenarios. Experimental results show that generated trajectories are safe, feasible and perfectly achieve the planning objective. The video results as well as the open-source implementation are available at: https://gitlab.uni.lu/360lab-public/mppi

Submitted 6 May, 2024; v1 submitted 3 August, 2023; originally announced August 2023.

arXiv:2307.16079 [pdf, other]

Counting Negative Eigenvalues for the Magnetic Pauli Operator

Authors: Søren Fournais, Rupert L. Frank, Magnus Goffeng, Ayman Kachmar, Mikael Sundqvist

Abstract: We study the Pauli operator in a two-dimensional, connected domain with Neumann or Robin boundary condition. We prove a sharp lower bound on the number of negative eigenvalues reminiscent of the Aharonov-Casher formula. We apply this lower bound to obtain a new formula on the number of eigenvalues of the magnetic Neumann Laplacian in the semi-classical limit. Our approach relies on reduction to a boundary Dirac operator. We analyze this boundary operator in two different ways. The first approach uses Atiyah-Patodi-Singer index theory. The second approach relies on a conservation law for the Benjamin-Ono equation.

Submitted 29 July, 2023; originally announced July 2023.

Comments: 29 pages, 1 figure

MSC Class: 35P15; 58J20; 47A40

arXiv:2307.14089 [pdf, ps, other]

The generalized Wehrl entropy bound in quantitative form

Authors: Rupert L. Frank, Fabio Nicola, Paolo Tilli

Abstract: Lieb and Carlen have shown that mixed states with minimal Wehrl entropy are coherent states. We prove that mixed states with almost minimal Wehrl entropy are almost coherent states. This is proved in a quantitative sense where both the norm and the exponent are optimal and the constant is explicit. We prove a similar bound for generalized Wehrl entropies. As an application, a sharp quantitative form of the log-Sobolev inequality for functions in the Fock space is provided.

Submitted 1 November, 2023; v1 submitted 26 July, 2023; originally announced July 2023.

Comments: 23 pages. Significantly revised to encompass density matrices as well (not only pure states)

arXiv:2307.13769 [pdf, ps, other]

Minimizers for an aggregation model with attractive-repulsive interaction

Authors: Rupert L. Frank, Ryan W. Matzke

Abstract: We solve explicitly a certain minimization problem for probability measures involving an interaction energy that is repulsive at short distances and attractive at large distances. We complement earlier works by showing that in part of the remaining parameter regime all minimizers are uniform distributions on the surface of a sphere, thus showing concentration on a lower dimensional set. Our method of proof uses convexity estimates on hypergeometric functions.

Submitted 23 November, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

Comments: 16 pages; significant strengthening of the main result compared to the previous version

arXiv:2307.08580 [pdf, other]

The Resume Paradox: Greater Language Differences, Smaller Pay Gaps

Authors: Joshua R. Minot, Marc Maier, Bradford Demarest, Nicholas Cheney, Christopher M. Danforth, Peter Sheridan Dodds, Morgan R. Frank

Abstract: Over the past decade, the gender pay gap has remained steady with women earning 84 cents for every dollar earned by men on average. Many studies explain this gap through demand-side bias in the labor market represented through employers' job postings. However, few studies analyze potential bias from the worker supply-side. Here, we analyze the language in millions of US workers' resumes to investigate how differences in workers' self-representation by gender compare to differences in earnings. Across US occupations, language differences between male and female resumes correspond to 11% of the variation in gender pay gap. This suggests that females' resumes that are semantically similar to males' resumes may have greater wage parity. However, surprisingly, occupations with greater language differences between male and female resumes have lower gender pay gaps. A doubling of the language difference between female and male resumes results in an annual wage increase of $2,797 for the average female worker. This result holds with controls for gender-biases of resume text and we find that per-word bias poorly describes the variance in wage gap. The results demonstrate that textual data and self-representation are valuable factors for improving worker representations and understanding employment inequities.

Submitted 17 July, 2023; originally announced July 2023.

Comments: 24 pages, 15 figures

arXiv:2306.04141 [pdf, other]

doi 10.1126/science.adh4451

Art and the science of generative AI: A deeper dive

Authors: Ziv Epstein, Aaron Hertzmann, Laura Herman, Robert Mahari, Morgan R. Frank, Matthew Groh, Hope Schroeder, Amy Smith, Memo Akten, Jessica Fjeld, Hany Farid, Neil Leach, Alex Pentland, Olga Russakovsky

Abstract: A new class of tools, colloquially called generative AI, can produce high-quality artistic media for visual arts, concept art, music, fiction, literature, video, and animation. The generative capabilities of these tools are likely to fundamentally alter the creative processes by which creators formulate ideas and put them into production. As creativity is reimagined, so too may be many sectors of society. Understanding the impact of generative AI - and making policy decisions around it - requires new interdisciplinary scientific inquiry into culture, economics, law, algorithms, and the interaction of technology and creativity. We argue that generative AI is not the harbinger of art's demise, but rather is a new medium with its own distinct affordances. In this vein, we consider the impacts of this new medium on creators across four themes: aesthetics and culture, legal questions of ownership and credit, the future of creative work, and impacts on the contemporary media ecosystem. Across these themes, we highlight key research questions and directions to inform policy and beneficial uses of the technology.

Submitted 7 June, 2023; originally announced June 2023.

Comments: This white paper is an expanded version of Epstein et al 2023 published in Science Perspectives on July 16, 2023 which you can find at the following DOI: 10.1126/science.adh4451

arXiv:2304.03115 [pdf, ps, other]

The sharp Sobolev inequality and its stability: An introduction

Authors: Rupert L. Frank

Abstract: These notes are an extended version of a series of lectures given at the CIME Summer School in Cetraro in June 2022. The goal is to explain questions about optimal functional inequalities on the example of the sharp Sobolev inequality and its fractional generalizations. Topics covered include compactness theorems for optimizing sequences, characterization of optimizers and quantitative stability.

Submitted 24 April, 2023; v1 submitted 6 April, 2023; originally announced April 2023.

Comments: 52 pages; lecture notes from a course held at the CIME Summer School in June 2022; minor changes in v2

arXiv:2303.06104 [pdf, ps, other]

Weighted CLR type bounds in two dimensions

Authors: Rupert L. Frank, Ari Laptev, Larry Read

Abstract: We derive weighted versions of the Cwikel-Lieb-Rozenblum inequality for the Schrödinger operator in two dimensions with a nontrivial Aharonov-Bohm magnetic field. Our bounds capture the optimal dependence on the flux and we identify a class of long-range potentials that saturate our bounds in the strong coupling limit. We also extend our analysis to the two-dimensional Schrödinger operator acting on antisymmetric functions and obtain similar results.

Submitted 10 March, 2023; originally announced March 2023.

Comments: 16 pages

arXiv:2303.05348 [pdf, ps, other]

doi 10.1016/j.jfa.2023.110104

On Sobolev norms involving Hardy operators in a half-space

Authors: Rupert L. Frank, Konstantin Merz

Abstract: We consider Hardy operators on the half-space, that is, ordinary and fractional Schrödinger operators with potentials given by the appropriate power of the distance to the boundary. We show that the scales of homogeneous Sobolev spaces generated by the Hardy operators and by the fractional Laplacian are comparable with each other when the coupling constant is not too large in a quantitative sense. Our results extend those in the whole Euclidean space and rely on recent heat kernel bounds.

Submitted 9 March, 2023; originally announced March 2023.

Comments: 40 pages

Journal ref: J. Funct. Anal. 285 (2023), no. 10, paper number 110104

arXiv:2302.08822 [pdf]

False perspectives on human language: why statistics needs linguistics

Authors: Matteo Greco, Andrea Cometa, Fiorenzo Artoni, Robert Frank, Andrea Moro

Abstract: A sharp tension exists about the nature of human language between two opposite parties: those who believe that statistical surface distributions, in particular using measures like surprisal, provide a better understanding of language processing, vs. those who believe that discrete hierarchical structures implementing linguistic information such as syntactic ones are a better tool. In this paper, we show that this dichotomy is a false one. Relying on the fact that statistical measures can be defined on the basis of either structural or non-structural models, we provide empirical evidence that only models of surprisal that reflect syntactic structure are able to account for language regularities.

Submitted 17 February, 2023; originally announced February 2023.

arXiv:2301.11462 [pdf, other]

How poor is the stimulus? Evaluating hierarchical generalization in neural networks trained on child-directed speech

Authors: Aditya Yedetore, Tal Linzen, Robert Frank, R. Thomas McCoy

Abstract: When acquiring syntax, children consistently choose hierarchical rules over competing non-hierarchical possibilities. Is this preference due to a learning bias for hierarchical structure, or due to more general biases that interact with hierarchical cues in children's linguistic input? We explore these possibilities by training LSTMs and Transformers - two types of neural networks without a hierarchical bias - on data similar in quantity and content to children's linguistic input: text from the CHILDES corpus. We then evaluate what these models have learned about English yes/no questions, a phenomenon for which hierarchical structure is crucial. We find that, though they perform well at capturing the surface statistics of child-directed speech (as measured by perplexity), both model types generalize in a way more consistent with an incorrect linear rule than the correct hierarchical rule. These results suggest that human-like generalization from text alone requires stronger biases than the general sequence-processing biases of standard neural network architectures.

Submitted 6 June, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

Comments: 10 pages plus references and appendices; accepted to ACL

ACM Class: J.4; I.2.7

arXiv:2211.15495 [pdf, other]

FastCycle: A Message Sharing Framework for Modular Automated Driving Systems

Authors: Mehdi Testouri, Gamal Elghazaly, Raphael Frank

Abstract: Automated Driving Systems (ADS) have rapidly evolved in recent years and their architecture has become sophisticated. Ensuring robustness, reliability and safety of performance is particularly important. The main challenge in building an ADS is meeting stringent performance requirements in terms of both making safe operational decisions and finishing processing in real time. Middleware plays a crucial role in handling these requirements in ADS. The way middleware shares data between the different system components has a direct impact on the overall performance, particularly the latency overhead. To this end, this paper presents FastCycle as a lightweight multi-threaded zero-copy messaging broker to meet the requirements of a high fidelity ADS in terms of modularity, real-time performance and security. We discuss the architecture and the main features of the proposed framework. Evaluation of the proposed framework based on standard metrics in comparison with popular middleware used in robotics and automated driving shows the improved performance of our framework. The implementation of FastCycle and the associated comparisons with other frameworks are open sourced.

Submitted 28 November, 2022; originally announced November 2022.

arXiv:2210.14798 [pdf, ps, other]

Sharp inequalities for coherent states and their optimizers

Authors: Rupert L. Frank

Abstract: We are interested in sharp functional inequalities for the coherent state transform related to the Wehrl conjecture and its generalizations. This conjecture was settled by Lieb in the case of the Heisenberg group and then by Lieb and Solovej for SU(2) and by Kulikov for SU(1,1) and the affine group. In this paper, we give alternative proofs and characterize, for the first time, the optimizers in the general case. We also extend the recent Faber--Krahn-type inequality for Heisenberg coherent states, due to Nicola and Tilli, to the SU(2) and SU(1,1) cases. Finally, we prove a family of reverse Hölder inequalities for polynomials, conjectured by Bodmann.

Submitted 26 October, 2022; originally announced October 2022.

Comments: 31 pages; dedicated to David Jerison, in admiration, on the occasion of his 70th birthday

arXiv:2209.12559 [pdf, ps, other]

Asymptotics of singular values for quantum derivatives

Authors: Rupert L. Frank, Fedor Sukochev, Dmitriy Zanin

Abstract: We obtain Weyl type asymptotics for the quantised derivative $\dbar f$ of a function $f$ from the homogeneous Sobolev space $\dot{W}^1_d(\mathbb{R}^d)$ on $\mathbb{R}^d.$ The asymptotic coefficient $\|\nabla f\|_{L_d(\mathbb R^d)}$ is equivalent to the norm of $\dbar f$ in the principal ideal $\mathcal{L}_{d,\infty},$ thus, providing a non-asymptotic, uniform bound on the spectrum of $\dbar f.$ Our methods are based on the $C^{\ast}$-algebraic notion of the principal symbol mapping on $\mathbb{R}^d$, as developed recently by the last two authors and collaborators.

Submitted 26 September, 2022; originally announced September 2022.

Comments: 41 pages; accepted for publication in Trans. Amer. Math. Soc

arXiv:2209.08651 [pdf, ps, other]

Sharp stability for Sobolev and log-Sobolev inequalities, with optimal dimensional dependence

Authors: Jean Dolbeault, Maria J. Esteban, Alessio Figalli, Rupert L. Frank, Michael Loss

Abstract: We prove a sharp quantitative version for the stability of the Sobolev inequality with explicit constants. Moreover, the constants have the correct behavior in the limit of large dimensions, which allows us to deduce an optimal quantitative stability estimate for the Gaussian log-Sobolev inequality with an explicit dimension-free constant. Our proofs rely on several ingredients such as competing symmetries, a flow based on continuous Steiner symmetrization that interpolates continuously between a function and its symmetric decreasing rearrangement, and refined estimates on the Sobolev functional in the neighborhood of the optimal Aubin--Talenti functions.

Submitted 15 July, 2023; v1 submitted 18 September, 2022; originally announced September 2022.

MSC Class: 49J40; 26D10; 35A23

arXiv:2208.04688 [pdf, other]

Connected Vehicle Platforms for Dynamic Insurance

Authors: Christian Colot, Francois Robinet, Geoffrey Nichils, Raphael Frank

Abstract: Following a regulatory change in Europe which mandates that car manufacturers include an eCall system in new vehicles, many car manufacturers are adding additional services on top, so that more and more cars become connected vehicles and act like IoT sensors. In the following study, we analyse the maturity level of this new technology to build insurance products that would take vehicle usage into account. For this, the connectivity of recent cars that are a priori eligible was first tested. Then, an ad-hoc platform was designed to collect driving data. In particular, 4 cars were connected to this platform for periods of over one month. Our results highlight that, while this technological innovation appears very promising for the future, the pricing, the lack of uniformity of data collected and the enrollment process are currently three pain points that should be addressed to offer large-scale opportunities. In the meantime, this technology might still be used for high value use cases such as the insurance of luxury cars.

Submitted 1 August, 2022; originally announced August 2022.

Comments: Working paper

arXiv:2207.05153 [pdf, ps, other]

Rearrangement methods in the work of Elliott Lieb

Authors: Rupert L. Frank

Abstract: We review some topics in the theory of symmetric decreasing rearrangements with a particular focus on Lieb's fundamental contributions. Topics covered include the Brascamp--Lieb--Luttinger theorem, the sharp Young and Hardy--Littlewood--Sobolev inequalities, as well as the continuity of the rearrangement map on Sobolev spaces.

Submitted 11 July, 2022; originally announced July 2022.

Comments: 22 pages; Dedicated, in great admiration, to Elliott Lieb on the occasion of his 90th birthday

arXiv:2207.05131 [pdf, ps, other]

The work of Elliott Lieb

Authors: Rupert L. Frank

Abstract: On the occasion of Elliott Lieb being awarded the Gauss Prize 2022, we give a non-technical overview of some of his seminal works in mathematical physics. We emphasize, in particular, his work on Coulomb many-body systems and functional inequalities.

Submitted 11 July, 2022; originally announced July 2022.

Comments: 16 pages; contribution to the Proceedings of the ICM 2022

arXiv:2206.04615 [pdf, other]

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza , et al. (426 additional authors not shown)

Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 450 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline.
Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.

Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

arXiv:2204.10081 [pdf, ps, other]

doi 10.1007/s11005-023-01631-9

The Scott conjecture for large Coulomb systems: a review

Authors: Rupert L. Frank, Konstantin Merz, Heinz Siedentop

Abstract: We review some older and more recent results concerning the energy and particle distribution in ground states of heavy Coulomb systems. The reviewed results are asymptotic in nature: they describe properties of many-particle systems in the limit of a large number of particles. Particular emphasis is put on models that take relativistic kinematics into account. While non-relativistic models are typically rather well understood, this is generally not the case for relativistic ones and leads to a variety of open questions.

Submitted 21 April, 2022; originally announced April 2022.

Comments: 62 pages

Journal ref: Letters in Mathematical Physics, Volume 113 (2023), no. 1, Paper No. 11

arXiv:2204.06618 [pdf, ps, other]

Formal Language Recognition by Hard Attention Transformers: Perspectives from Circuit Complexity

Authors: Yiding Hao, Dana Angluin, Robert Frank

Abstract: This paper analyzes three formal models of Transformer encoders that differ in the form of their self-attention mechanism: unique hard attention (UHAT); generalized unique hard attention (GUHAT), which generalizes UHAT; and averaging hard attention (AHAT). We show that UHAT and GUHAT Transformers, viewed as string acceptors, can only recognize formal languages in the complexity class AC$^0$, the class of languages recognizable by families of Boolean circuits of constant depth and polynomial size. This upper bound subsumes Hahn's (2020) results that GUHAT cannot recognize the DYCK languages or the PARITY language, since those languages are outside AC$^0$ (Furst et al., 1984). In contrast, the non-AC$^0$ languages MAJORITY and DYCK-1 are recognizable by AHAT networks, implying that AHAT can recognize languages that UHAT and GUHAT cannot.

Submitted 13 April, 2022; originally announced April 2022.

Comments: To appear in Transactions of the Association for Computational Linguistics

arXiv:2204.00877 [pdf, ps, other]

An improved one-dimensional Hardy inequality

Authors: Rupert L. Frank, Ari Laptev, Timo Weidl

Abstract: We prove a one-dimensional Hardy inequality on the halfline with sharp constant, which improves the classical form of this inequality. As a consequence of this new inequality we can rederive known doubly weighted Hardy inequalities. Our motivation comes from the theory of Schrödinger operators and we explain the use of Hardy inequalities in that context.

Submitted 2 April, 2022; originally announced April 2022.

Comments: 19 pages; Dedicated, in admiration, to V. Maz'ya on the occasion of his 85th birthday

arXiv:2203.09397 [pdf, other]

Coloring the Blank Slate: Pre-training Imparts a Hierarchical Inductive Bias to Sequence-to-sequence Models

Authors: Aaron Mueller, Robert Frank, Tal Linzen, Luheng Wang, Sebastian Schuster

Abstract: Relations between words are governed by hierarchical structure rather than linear ordering. Sequence-to-sequence (seq2seq) models, despite their success in downstream NLP applications, often fail to generalize in a hierarchy-sensitive manner when performing syntactic transformations - for example, transforming declarative sentences into questions. However, syntactic evaluations of seq2seq models have only observed models that were not pre-trained on natural language data before being trained to perform syntactic transformations, in spite of the fact that pre-training has been found to induce hierarchical linguistic generalizations in language models; in other words, the syntactic capabilities of seq2seq models may have been greatly understated. We address this gap using the pre-trained seq2seq models T5 and BART, as well as their multilingual variants mT5 and mBART. We evaluate whether they generalize hierarchically on two transformations in two languages: question formation and passivization in English and German. We find that pre-trained seq2seq models generalize hierarchically when performing syntactic transformations, whereas models trained from scratch on syntactic transformations do not. This result presents evidence for the learnability of hierarchical syntactic information from non-annotated natural language text while also demonstrating that seq2seq models are capable of syntactic generalization, though only after exposure to much more language data than human learners receive.

Submitted 17 March, 2022; originally announced March 2022.

Comments: Accepted to Findings of ACL 2022

arXiv:2203.01001 [pdf, ps, other]

A characterization of $\dot W^{1,p}(\mathbb R^d)$

Authors: Rupert L. Frank

Abstract: For $1<p<\infty$ we give a characterization of the Sobolev space $\dot W^{1,p}(\mathbb R^d)$ in terms of the oscillations of a function on balls of varying centers and radii. Our work is motivated both by the study of trace ideal properties of commutators with singular integral operators and by work of Nguyen and by Brezis, Van Schaftingen and Yung on derivative-free characterizations of Sobolev spaces.

Submitted 9 July, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

Comments: 16 pages; Dedicated to V. Maz'ya on the occasion of his 85th birthday; accepted version with typos corrected

arXiv:2202.03611 [pdf, other]

Do Language Models Learn Position-Role Mappings?

Authors: Jackson Petty, Michael Wilson, Robert Frank

Abstract: How is knowledge of position-role mappings in natural language learned? We explore this question in a computational setting, testing whether a variety of well-performing pre-trained language models (BERT, RoBERTa, and DistilBERT) exhibit knowledge of these mappings, and whether this knowledge persists across syntactic, structural, and lexical alternations. In Experiment 1, we show that these neural models do indeed recognize distinctions between theme and recipient roles in ditransitive constructions, and that these distinct patterns are shared across construction type. We strengthen this finding in Experiment 2 by showing that fine-tuning these language models on novel theme- and recipient-like tokens in one paradigm allows the models to make correct predictions about their placement in other paradigms, suggesting that the knowledge of these mappings is shared rather than independently learned. We do, however, observe some limitations of this generalization when tasks involve constructions with novel ditransitive verbs, hinting at a degree of lexical specificity which underlies model performance.

Submitted 7 February, 2022; originally announced February 2022.

Comments: To appear in the BUCLD 46 Proceedings

arXiv:2202.00323 [pdf, ps, other]

Weyl's law under minimal assumptions

Authors: Rupert L. Frank

Abstract: We show that Weyl's law for the number and the Riesz means of negative eigenvalues of Schrödinger operators remains valid under minimal assumptions on the potential, the vector potential and the underlying domain.

Submitted 1 February, 2022; originally announced February 2022.

Comments: 17 pages; dedicated to the memory of Sergey Naboko

arXiv:2201.09605 [pdf, ps, other]

An inequality for the normal derivative of the Lane-Emden ground state

Authors: Rupert L. Frank, Simon Larson

Abstract: We consider Lane-Emden ground states with polytropic index $0\leq q-1\leq 1$, that is, minimizers of the Dirichlet integral among $L^q$-normalized functions. Our main result is a sharp lower bound on the $L^2$-norm of the normal derivative in terms of the energy, which implies a corresponding isoperimetric inequality. Our bound holds for arbitrary bounded open Lipschitz sets $Ω\subset\mathbb{R}^d$, without assuming convexity.

Submitted 28 July, 2022; v1 submitted 24 January, 2022; originally announced January 2022.

arXiv:2201.03610 [pdf, ps, other]

A sharp criterion for zero modes of the Dirac equation

Authors: Rupert L. Frank, Michael Loss

Abstract: It is shown that $\Vert A \Vert_{L^d}^2 \ge \frac{d}{d-2}\, S_d$ is a necessary condition for the existence of a nontrivial solution of the Dirac equation $γ\cdot (-i\nabla -A)ψ= 0$ in $d$ dimensions. Here, $S_d$ is the sharp Sobolev constant. If $d$ is odd and $\Vert A \Vert_{L^d}^2= \frac{d}{d-2}\, S_d$, then there exist vector potentials that allow for zero modes. A complete classification of these vector potentials and their corresponding zero modes is given.

Submitted 10 January, 2022; originally announced January 2022.

Comments: LaTeX, 26 pages

MSC Class: Primary: 35F50; Secondary: 81V45; 47J10

arXiv:2111.07479 [pdf, other]

Neuronal avalanches and critical dynamics of brain waves

Authors: Vitaly L. Galinsky, Lawrence R. Frank

Abstract: Analytical expressions for scaling of brain wave spectra derived from the general nonlinear wave Hamiltonian form show excellent agreement with experimental "neuronal avalanche" data. The theory of the weakly evanescent nonlinear brain wave dynamics reveals the underlying collective processes hidden behind the phenomenological statistical description of the neuronal avalanches and connects together the whole range of brain activity states, from oscillatory wave-like modes, to neuronal avalanches, to incoherent spiking, showing that the neuronal avalanches are just the manifestation of the different nonlinear side of wave processes abundant in cortical tissue. In a more broad way these results show that a system of wave modes interacting through all possible combinations of the third order nonlinear terms described by a general wave Hamiltonian necessarily produces anharmonic wave modes with temporal and spatial scaling properties that follow scale free power laws. To the best of our knowledge this was never reported in the physical literature and may be applicable to many physical systems that involve wave processes and not just to neuronal avalanches.

Submitted 14 November, 2021; originally announced November 2021.

Comments: 6 pages, 2 figures

arXiv:2110.13317 [pdf, other]

Exposure of occupations to technologies of the fourth industrial revolution

Authors: Benjamin Meindl, Morgan R. Frank, Joana Mendonça

Abstract: The fourth industrial revolution (4IR) is likely to have a substantial impact on the economy. Companies need to build up capabilities to implement new technologies, and automation may make some occupations obsolete. However, where, when, and how the change will happen remain to be determined. Robust empirical indicators of technological progress linked to occupations can help to illuminate this change. With this aim, we provide such an indicator based on patent data. Using natural language processing, we calculate patent exposure scores for more than 900 occupations, which represent the technological progress related to them. To provide a lens on the impact of the 4IR, we differentiate between traditional and 4IR patent exposure. Our method differs from previous approaches in that it both accounts for the diversity of task-level patent exposures within an occupation and reflects work activities more accurately. We find that exposure to 4IR patents differs from traditional patent exposure. Manual tasks, and accordingly occupations such as construction and production, are exposed mainly to traditional (non-4IR) patents but have low exposure to 4IR patents. The analysis suggests that 4IR technologies may have a negative impact on job growth; this impact appears 10 to 20 years after patent filing. Further, we compared the 4IR exposure to other automation and AI exposure scores. Whereas many measures refer to theoretical automation potential, our patent-based indicator reflects actual technology diffusion.
Our work not only allows analyses of the impact of 4IR technologies as a whole, but also provides exposure scores for more than 300 technology fields, such as AI and smart office technologies. Finally, the work provides a general mapping of patents to tasks and occupations, which enables future researchers to construct individual exposure measures.

Submitted 25 October, 2021; originally announced October 2021.

Comments: 65 pages, 18 figures
