-
Online Stackelberg Optimization via Nonlinear Control
Authors:
William Brown,
Christos Papadimitriou,
Tim Roughgarden
Abstract:
In repeated interaction problems with adaptive agents, our objective often requires anticipating and optimizing over the space of possible agent responses. We show that many problems of this form can be cast as instances of online (nonlinear) control which satisfy \textit{local controllability}, with convex losses over a bounded state space which encodes agent behavior, and we introduce a unified algorithmic framework for tractable regret minimization in such cases. When the instance dynamics are known but otherwise arbitrary, we obtain oracle-efficient $O(\sqrt{T})$ regret by reduction to online convex optimization, which can be made computationally efficient if dynamics are locally \textit{action-linear}. In the presence of adversarial disturbances to the state, we give tight bounds in terms of either the cumulative or per-round disturbance magnitude (for \textit{strongly} or \textit{weakly} locally controllable dynamics, respectively). Additionally, we give sublinear regret results for the cases of unknown locally action-linear dynamics as well as for the bandit feedback setting. Finally, we demonstrate applications of our framework to well-studied problems including performative prediction, recommendations for adaptive agents, adaptive pricing of real-valued goods, and repeated gameplay against no-regret learners, directly yielding extensions beyond prior results in each case.
Submitted 26 June, 2024;
originally announced June 2024.
-
Physics-informed neural networks for parameter learning of wildfire spreading
Authors:
Konstantinos Vogiatzoglou,
Costas Papadimitriou,
Vasilis Bontozoglou,
Konstantinos Ampountolas
Abstract:
Wildland fires pose terrifying natural hazards, underscoring the urgent need to develop data-driven and physics-informed digital twins for wildfire prevention, monitoring, intervention, and response. In this direction of research, this work introduces a physics-informed neural network (PiNN) to learn the unknown parameters of an interpretable wildfire spreading model. The considered wildfire spreading model integrates fundamental physical laws articulated by key model parameters, essential for capturing the complex behavior of wildfires. The proposed machine learning approach leverages the theory of artificial neural networks with the physical constraints governing wildfire dynamics, such as the first principles of mass and energy conservation. Training of the PiNN for physics-informed parameter identification is realized using data of the temporal evolution of one- and two-dimensional (plane surface) fire fronts that have been obtained from a high-fidelity simulator of the wildfire spreading model under consideration. The parameter learning results demonstrate the remarkable predictive ability of the proposed PiNN in uncovering the unknown coefficients in both the one- and two-dimensional fire spreading scenarios. Additionally, this methodology exhibits robustness by identifying the same parameters in the presence of noisy data. The proposed framework is envisioned to be incorporated in a physics-informed digital twin for intelligent wildfire management and risk assessment.
Submitted 20 June, 2024;
originally announced June 2024.
-
Coin-Flipping In The Brain: Statistical Learning with Neuronal Assemblies
Authors:
Max Dabagia,
Daniel Mitropolsky,
Christos H. Papadimitriou,
Santosh S. Vempala
Abstract:
How intelligence arises from the brain is a central problem in science. A crucial aspect of intelligence is dealing with uncertainty -- developing good predictions about one's environment, and converting these predictions into decisions. The brain itself seems to be noisy at many levels, from chemical processes which drive development and neuronal activity to trial variability of responses to stimuli. One hypothesis is that the noise inherent to the brain's mechanisms is used to sample from a model of the world and generate predictions. To test this hypothesis, we study the emergence of statistical learning in NEMO, a biologically plausible computational model of the brain based on stylized neurons and synapses, plasticity, and inhibition, and giving rise to assemblies -- a group of neurons whose coordinated firing is tantamount to recalling a location, concept, memory, or other primitive item of cognition. We show in theory and simulation that connections between assemblies record statistics, and ambient noise can be harnessed to make probabilistic choices between assemblies. This allows NEMO to create internal models such as Markov chains entirely from the presentation of sequences of stimuli. Our results provide a foundation for biologically plausible probabilistic computation, and add theoretical support to the hypothesis that noise is a useful component of the brain's mechanism for cognition.
Submitted 11 June, 2024;
originally announced June 2024.
-
Representative electricity price profiles for European day-ahead and intraday spot markets
Authors:
Chrysanthi Papadimitriou,
Jan C. Schulze,
Alexander Mitsos
Abstract:
We propose a method to construct representative price profiles of the day-ahead (DA) and the intraday (ID) electricity spot markets and use this method to provide examples of ready-to-use price data sets. In contrast to common scenario generation approaches, the method is deterministic and relies on a small number of degrees of freedom, with the aim to be well defined and easy to use. We thereby target an enhanced comparability of future research studies on demand-side management and energy cost optimization. We construct the price profiles based on historical time series from the spot markets of interest, e.g., European Power Exchange (EPEX) spot. To this end, we extract key price components from the data while also accounting for known dominant mechanisms in the price variation. Further, the method is able to preserve key statistical features of the historical data (e.g., mean and standard deviation) when constructing the benchmark profile. Finally, our approach ensures comparability of ID and DA price profiles by design, as their cumulative (integral) price can be made identical if needed.
Submitted 23 May, 2024;
originally announced May 2024.
-
Masked Generative Story Transformer with Character Guidance and Caption Augmentation
Authors:
Christos Papadimitriou,
Giorgos Filandrianos,
Maria Lymperaiou,
Giorgos Stamou
Abstract:
Story Visualization (SV) is a challenging generative vision task that requires both visual quality and consistency between different frames in generated image sequences. Previous approaches either employ some kind of memory mechanism to maintain context throughout an auto-regressive generation of the image sequence, or model the generation of the characters and their background separately, to improve the rendering of characters. In contrast, we embrace a completely parallel transformer-based approach, exclusively relying on Cross-Attention with past and future captions to achieve consistency. Additionally, we propose a Character Guidance technique to focus on the generation of characters in an implicit manner, by forming a combination of text-conditional and character-conditional logits in the logit space. We also employ a caption-augmentation technique, carried out by a Large Language Model (LLM), to enhance the robustness of our approach. The combination of these methods culminates in state-of-the-art (SOTA) results over various metrics in the most prominent SV benchmark (Pororo-SV), attained with constrained resources while achieving superior computational complexity compared to previous work. The validity of our quantitative results is supported by a human survey.
Submitted 13 March, 2024;
originally announced March 2024.
-
On Limitations of the Transformer Architecture
Authors:
Binghui Peng,
Srini Narayanan,
Christos Papadimitriou
Abstract:
What are the root causes of hallucinations in large language models (LLMs)? We use Communication Complexity to prove that the Transformer layer is incapable of composing functions (e.g., identify a grandparent of a person in a genealogy) if the domains of the functions are large enough; we show through examples that this inability is already empirically present when the domains are quite small. We also point out that several mathematical tasks that are at the core of the so-called compositional tasks thought to be hard for LLMs are unlikely to be solvable by Transformers, for large enough instances and assuming that certain well accepted conjectures in the field of Computational Complexity are true.
Submitted 26 February, 2024; v1 submitted 12 February, 2024;
originally announced February 2024.
-
Monitoring-Supported Value Generation for Managing Structures and Infrastructure Systems
Authors:
Antonios Kamariotis,
Eleni Chatzi,
Daniel Straub,
Nikolaos Dervilis,
Kai Goebel,
Aidan J. Hughes,
Geert Lombaert,
Costas Papadimitriou,
Konstantinos G. Papakonstantinou,
Matteo Pozzi,
Michael Todd,
Keith Worden
Abstract:
To maximize its value, the design, development and implementation of Structural Health Monitoring (SHM) should focus on its role in facilitating decision support. In this position paper, we offer perspectives on the synergy between SHM and decision-making. We propose a classification of SHM use cases aligning with various dimensions that are closely linked to the respective decision contexts. The types of decisions that have to be supported by the SHM system within these settings are discussed along with the corresponding challenges. We provide an overview of different classes of models that are required for integrating SHM in the decision-making process to support management and operation and maintenance of structures and infrastructure systems. Fundamental decision-theoretic principles and state-of-the-art methods for optimizing maintenance and operational decision-making under uncertainty are briefly discussed. Finally, we offer a viewpoint on the appropriate course of action for quantifying, validating and maximizing the added value generated by SHM. This work aspires to synthesize the different perspectives of the SHM, Prognostic Health Management (PHM), and reliability communities, and deliver a roadmap towards monitoring-based decision support.
Submitted 4 January, 2024;
originally announced February 2024.
-
The complexity of non-stationary reinforcement learning
Authors:
Christos Papadimitriou,
Binghui Peng
Abstract:
The problem of continual learning in the domain of reinforcement learning, often called non-stationary reinforcement learning, has been identified as an important challenge to the application of reinforcement learning. We prove a worst-case complexity result, which we believe captures this challenge: Modifying the probabilities or the reward of a single state-action pair in a reinforcement learning problem requires an amount of time almost as large as the number of states in order to keep the value function up to date, unless the strong exponential time hypothesis (SETH) is false; SETH is a widely accepted strengthening of the P $\neq$ NP conjecture. Recall that the number of states in current applications of reinforcement learning is typically astronomical. In contrast, we show that just $\textit{adding}$ a new state-action pair is considerably easier to implement.
Submitted 13 July, 2023;
originally announced July 2023.
-
The Architecture of a Biologically Plausible Language Organ
Authors:
Daniel Mitropolsky,
Christos H. Papadimitriou
Abstract:
We present a simulated biologically plausible language organ, made up of stylized but realistic neurons, synapses, brain areas, plasticity, and a simplified model of sensory perception. We show through experiments that this model succeeds in an important early step in language acquisition: the learning of nouns, verbs, and their meanings, from the grounded input of only a modest number of sentences. Learning in this system is achieved through Hebbian plasticity, and without backpropagation. Our model goes beyond a parser previously designed in a similar environment, with the critical addition of a biologically plausible account for how language can be acquired in the infant's brain, not just processed by a mature brain.
Submitted 27 June, 2023;
originally announced June 2023.
-
Computation with Sequences in a Model of the Brain
Authors:
Max Dabagia,
Christos H. Papadimitriou,
Santosh S. Vempala
Abstract:
Even as machine learning exceeds human-level performance on many applications, the generality, robustness, and rapidity of the brain's learning capabilities remain unmatched. How cognition arises from neural activity is a central open question in neuroscience, inextricable from the study of intelligence itself. A simple formal model of neural activity was proposed in Papadimitriou [2020] and has been subsequently shown, through both mathematical proofs and simulations, to be capable of implementing certain simple cognitive operations via the creation and manipulation of assemblies of neurons. However, many intelligent behaviors rely on the ability to recognize, store, and manipulate temporal sequences of stimuli (planning, language, navigation, to list a few). Here we show that, in the same model, time can be captured naturally as precedence through synaptic weights and plasticity, and, as a result, a range of computations on sequences of assemblies can be carried out. In particular, repeated presentation of a sequence of stimuli leads to the memorization of the sequence through corresponding neural assemblies: upon future presentation of any stimulus in the sequence, the corresponding assembly and its subsequent ones will be activated, one after the other, until the end of the sequence. Finally, we show that any finite state machine can be learned in a similar way, through the presentation of appropriate patterns of sequences. Through an extension of this mechanism, the model can be shown to be capable of universal computation. We support our analysis with a number of experiments to probe the limits of learning in this model in key ways. Taken together, these results provide a concrete hypothesis for the basis of the brain's remarkable abilities to compute and learn, with sequences playing a vital role.
Submitted 16 October, 2023; v1 submitted 6 June, 2023;
originally announced June 2023.
-
On the Integration of Physics-Based Machine Learning with Hierarchical Bayesian Modeling Techniques
Authors:
Omid Sedehi,
Antonina M. Kosikova,
Costas Papadimitriou,
Lambros S. Katafygiotis
Abstract:
Machine Learning (ML) has been widely used for modeling and predicting physical systems. These techniques offer high expressive power and good generalizability for interpolation within observed data sets. However, the disadvantage of black-box models is that they underperform under blind conditions since no physical knowledge is incorporated. Physics-based ML aims to address this problem by retaining the mathematical flexibility of ML techniques while incorporating physics. Accordingly, this paper proposes to embed mechanics-based models into the mean function of a Gaussian Process (GP) model and characterize potential discrepancies through kernel machines. A specific class of kernel function is promoted, which has a connection with the gradient of the physics-based model with respect to the input and parameters and shares similarity with the exact autocovariance function of linear dynamical systems. The spectral properties of the kernel function enable considering dominant periodic processes originating from physics misspecification. Nevertheless, the stationarity of the kernel function is a difficult hurdle in the sequential processing of long data sets, resolved through hierarchical Bayesian techniques. This implementation is also advantageous for mitigating computational costs, alleviating the scalability limitations of GPs when dealing with sequential data. Using numerical and experimental examples, potential applications of the proposed method to structural dynamics inverse problems are demonstrated.
Submitted 28 February, 2023;
originally announced March 2023.
-
Extremal combinatorics, iterated pigeonhole arguments, and generalizations of PPP
Authors:
Amol Pasarkar,
Mihalis Yannakakis,
Christos Papadimitriou
Abstract:
We study the complexity of computational problems arising from existence theorems in extremal combinatorics. For some of these problems, a solution is guaranteed to exist based on an iterated application of the Pigeonhole Principle. This results in the definition of a new complexity class within TFNP, which we call PLC (for "polynomial long choice"). PLC includes all of PPP, as well as numerous previously unclassified total problems, including search problems related to Ramsey's theorem, the Sunflower theorem, the Erdős-Ko-Rado lemma, and König's lemma. Whether the first two of these four problems are PLC-complete is an important open question which we pursue; in contrast, we show that the latter two are PPP-complete. Finally, we reframe PPP as an optimization problem, and define a hierarchy of such problems related to Turán's theorem.
Submitted 15 September, 2022;
originally announced September 2022.
-
The Computational Complexity of Multi-player Concave Games and Kakutani Fixed Points
Authors:
Christos H. Papadimitriou,
Emmanouil-Vasileios Vlatakis-Gkaragkounis,
Manolis Zampetakis
Abstract:
Kakutani's Fixed Point theorem is a fundamental theorem in topology with numerous applications in game theory and economics. Computational formulations of Kakutani exist only in special cases and are too restrictive to be useful in reductions. In this paper, we provide a general computational formulation of Kakutani's Fixed Point Theorem and we prove that it is PPAD-complete. As an application of our theorem we are able to characterize the computational complexity of the following fundamental problems:
(1) Concave Games. Introduced by the celebrated works of Debreu and Rosen in the 1950s and 60s, concave $n$-person games have found many important applications in Economics and Game Theory. We characterize the computational complexity of finding an equilibrium in such games. We show that a general formulation of this problem belongs to PPAD, and that finding an equilibrium is PPAD-hard even for a rather restricted class of games of this kind: strongly-concave utilities that can be expressed as multivariate polynomials of a constant degree with axis-aligned box constraints.
(2) Walrasian Equilibrium. Using Kakutani's fixed point theorem, Arrow and Debreu resolved an open problem related to Walras's theorem on the existence of price equilibria in general economies. There are many results about the PPAD-hardness of Walrasian equilibria, but the inclusion in PPAD is only known for piecewise linear utilities. We show that the problem with general convex utilities is in PPAD.
Along the way we provide a Lipschitz continuous version of Berge's maximum theorem that may be of independent interest.
Submitted 25 May, 2023; v1 submitted 15 July, 2022;
originally announced July 2022.
-
Center-Embedding and Constituency in the Brain and a New Characterization of Context-Free Languages
Authors:
Daniel Mitropolsky,
Adiba Ejaz,
Mirah Shi,
Mihalis Yannakakis,
Christos H. Papadimitriou
Abstract:
A computational system implemented exclusively through the spiking of neurons was recently shown capable of syntax, that is, of carrying out the dependency parsing of simple English sentences. We address two of the most important questions left open by that work: constituency (the identification of key parts of the sentence such as the verb phrase) and the processing of dependent sentences, especially center-embedded ones. We show that these two aspects of language can also be implemented by neurons and synapses in a way that is compatible with what is known, or widely believed, about the structure and function of the language organ. Surprisingly, the way we implement center embedding points to a new characterization of context-free languages.
Submitted 27 June, 2022;
originally announced June 2022.
-
Memory Bounds for Continual Learning
Authors:
Xi Chen,
Christos Papadimitriou,
Binghui Peng
Abstract:
Continual learning, or lifelong learning, is a formidable current challenge to machine learning. It requires the learner to solve a sequence of $k$ different learning tasks, one after the other, while retaining its aptitude for earlier tasks; the continual learner should scale better than the obvious solution of developing and maintaining a separate learner for each of the $k$ tasks. We embark on a complexity-theoretic study of continual learning in the PAC framework. We make novel uses of communication complexity to establish that any continual learner, even an improper one, needs memory that grows linearly with $k$, strongly suggesting that the problem is intractable. When logarithmically many passes over the learning tasks are allowed, we provide an algorithm based on multiplicative weights update whose memory requirement scales well; we also establish that improper learning is necessary for such performance. We conjecture that these results may lead to new promising approaches to continual learning.
Submitted 22 April, 2022;
originally announced April 2022.
-
Nash, Conley, and Computation: Impossibility and Incompleteness in Game Dynamics
Authors:
Jason Milionis,
Christos Papadimitriou,
Georgios Piliouras,
Kelly Spendlove
Abstract:
Under what conditions do the behaviors of players, who play a game repeatedly, converge to a Nash equilibrium? If one assumes that the players' behavior is a discrete-time or continuous-time rule whereby the current mixed strategy profile is mapped to the next, this becomes a problem in the theory of dynamical systems. We apply this theory, and in particular the concepts of chain recurrence, attractors, and Conley index, to prove a general impossibility result: there exist games for which any dynamics is bound to have starting points that do not end up at a Nash equilibrium. We also prove a stronger result for $ε$-approximate Nash equilibria: there are games such that no game dynamics can converge (in an appropriate sense) to $ε$-Nash equilibria, and in fact the set of such games has positive measure. Further numerical results demonstrate that this holds for any $ε$ between zero and $0.09$. Our results establish that, although the notions of Nash equilibria (and its computation-inspired approximations) are universally applicable in all games, they are also fundamentally incomplete as predictors of long term behavior, regardless of the choice of dynamics.
Submitted 26 March, 2022;
originally announced March 2022.
-
Planning with Biological Neurons and Synapses
Authors:
Francesco d'Amore,
Daniel Mitropolsky,
Pierluigi Crescenzi,
Emanuele Natale,
Christos H. Papadimitriou
Abstract:
We revisit the planning problem in the blocks world, and we implement a known heuristic for this task. Importantly, our implementation is biologically plausible, in the sense that it is carried out exclusively through the spiking of neurons. Even though much has been accomplished in the blocks world over the past five decades, we believe that this is the first algorithm of its kind. The input is a sequence of symbols encoding an initial set of block stacks as well as a target set, and the output is a sequence of motion commands such as "put the top block in stack 1 on the table". The program is written in the Assembly Calculus, a recently proposed computational framework meant to model computation in the brain by bridging the gap between neural activity and cognitive function. Its elementary objects are assemblies of neurons (stable sets of neurons whose simultaneous firing signifies that the subject is thinking of an object, concept, word, etc.), its commands include project and merge, and its execution model is based on widely accepted tenets of neuroscience. A program in this framework essentially sets up a dynamical system of neurons and synapses that eventually, with high probability, accomplishes the task. The purpose of this work is to establish empirically that reasonably large programs in the Assembly Calculus can execute correctly and reliably; and that rather realistic -- if idealized -- higher cognitive functions, such as planning in the blocks world, can be implemented successfully by such programs.
Submitted 16 December, 2021; v1 submitted 15 December, 2021;
originally announced December 2021.
-
Assemblies of neurons learn to classify well-separated distributions
Authors:
Max Dabagia,
Christos H. Papadimitriou,
Santosh S. Vempala
Abstract:
An assembly is a large population of neurons whose synchronous firing is hypothesized to represent a memory, concept, word, and other cognitive categories. Assemblies are believed to provide a bridge between high-level cognitive phenomena and low-level neural activity. Recently, a computational system called the Assembly Calculus (AC), with a repertoire of biologically plausible operations on assemblies, has been shown capable of simulating arbitrary space-bounded computation, but also of simulating complex cognitive phenomena such as language, reasoning, and planning. However, the mechanism whereby assemblies can mediate learning has not been known. Here we present such a mechanism, and prove rigorously that, for simple classification problems defined on distributions of labeled assemblies, a new assembly representing each class can be reliably formed in response to a few stimuli from the class; this assembly is henceforth reliably recalled in response to new stimuli from the same class. Furthermore, such class assemblies will be distinguishable as long as the respective classes are reasonably separated -- for example, when they are clusters of similar assemblies. To prove these results, we draw on random graph theory with dynamic edge weights to estimate sequences of activated vertices, yielding strong generalizations of previous calculations and theorems in this field over the past five years. These theorems are backed up by experiments demonstrating the successful formation of assemblies which represent concept classes on synthetic data drawn from such distributions, and also on MNIST, which lends itself to classification through one assembly per digit. Seen as a learning algorithm, this mechanism is entirely online, generalizes from very few samples, and requires only mild supervision -- all key attributes of learning in a model of the brain.
Submitted 3 July, 2022; v1 submitted 6 October, 2021;
originally announced October 2021.
-
A Biologically Plausible Parser
Authors:
Daniel Mitropolsky,
Michael J. Collins,
Christos H. Papadimitriou
Abstract:
We describe a parser of English effectuated by biologically plausible neurons and synapses, and implemented through the Assembly Calculus, a recently proposed computational framework for cognitive function. We demonstrate that this device is capable of correctly parsing reasonably nontrivial sentences. While our experiments entail rather simple sentences in English, our results suggest that the parser can be extended beyond what we have implemented, to several directions encompassing much of language. For example, we present a simple Russian version of the parser, and discuss how to handle recursion, embedding, and polysemy.
Submitted 4 August, 2021;
originally announced August 2021.
-
Public Goods Games in Directed Networks
Authors:
Christos Papadimitriou,
Binghui Peng
Abstract:
Public goods games in undirected networks are generally known to have pure Nash equilibria, which are easy to find. In contrast, we prove that, in directed networks, a broad range of public goods games have intractable equilibrium problems: The existence of pure Nash equilibria is NP-hard to decide, and mixed Nash equilibria are PPAD-hard to find. We define general utility public goods games, and prove a complexity dichotomy result for finding pure equilibria, and a PPAD-completeness proof for mixed Nash equilibria. Even in the divisible goods variant of the problem, where existence is easy to prove, finding the equilibrium is PPAD-complete. Finally, when the treewidth of the directed network is appropriately bounded, we prove that polynomial-time algorithms are possible.
Submitted 14 July, 2021; v1 submitted 1 June, 2021;
originally announced June 2021.
-
Self-Attention Networks Can Process Bounded Hierarchical Languages
Authors:
Shunyu Yao,
Binghui Peng,
Christos Papadimitriou,
Karthik Narasimhan
Abstract:
Despite their impressive performance in NLP, self-attention networks were recently proved to be limited for processing formal languages with hierarchical structure, such as $\mathsf{Dyck}_k$, the language consisting of well-nested parentheses of $k$ types. This suggested that natural language can be approximated well with models that are too weak for formal languages, or that the role of hierarchy and recursion in natural language might be limited. We qualify this implication by proving that self-attention networks can process $\mathsf{Dyck}_{k, D}$, the subset of $\mathsf{Dyck}_{k}$ with depth bounded by $D$, which arguably better captures the bounded hierarchical structure of natural language. Specifically, we construct a hard-attention network with $D+1$ layers and $O(\log k)$ memory size (per token per layer) that recognizes $\mathsf{Dyck}_{k, D}$, and a soft-attention network with two layers and $O(\log k)$ memory size that generates $\mathsf{Dyck}_{k, D}$. Experiments show that self-attention networks trained on $\mathsf{Dyck}_{k, D}$ generalize to longer inputs with near-perfect accuracy, and also verify the theoretical memory advantage of self-attention networks over recurrent networks.
Submitted 12 March, 2023; v1 submitted 24 May, 2021;
originally announced May 2021.
-
Online Stochastic Max-Weight Bipartite Matching: Beyond Prophet Inequalities
Authors:
Christos Papadimitriou,
Tristan Pollner,
Amin Saberi,
David Wajc
Abstract:
The rich literature on online Bayesian selection problems has long focused on so-called prophet inequalities, which compare the gain of an online algorithm to that of a "prophet" who knows the future. An equally-natural, though significantly less well-studied benchmark is the optimum online algorithm, which may be omnipotent (i.e., computationally-unbounded), but not omniscient. What is the computational complexity of the optimum online? How well can a polynomial-time algorithm approximate it?
We study the above questions for the online stochastic maximum-weight matching problem under vertex arrivals. For this problem, a number of $1/2$-competitive algorithms are known against optimum offline. This is the best possible ratio for this problem, as it generalizes the original single-item prophet inequality problem.
We present a polynomial-time algorithm which approximates the optimal online algorithm within a factor of $0.51$ -- beating the best-possible prophet inequality. In contrast, we show that it is PSPACE-hard to approximate this problem within some constant $α< 1$.
Submitted 18 August, 2021; v1 submitted 20 February, 2021;
originally announced February 2021.
-
The Platform Design Problem
Authors:
Christos Papadimitriou,
Kiran Vodrahalli,
Mihalis Yannakakis
Abstract:
On-line firms deploy suites of software platforms, where each platform is designed to interact with users during a certain activity, such as browsing, chatting, socializing, emailing, driving, etc. The economic and incentive structure of this exchange, as well as its algorithmic nature, have not been explored to our knowledge. We model this interaction as a Stackelberg game between a Designer and one or more Agents. We model an Agent as a Markov chain whose states are activities; we assume that the Agent's utility is a linear function of the steady-state distribution of this chain. The Designer may design a platform for each of these activities/states; if a platform is adopted by the Agent, the transition probabilities of the Markov chain are affected, and so is the objective of the Agent. The Designer's utility is a linear function of the steady state probabilities of the accessible states minus the development cost of the platforms. The underlying optimization problem of the Agent -- how to choose the states for which to adopt the platform -- is an MDP. If this MDP has a simple yet plausible structure (the transition probabilities from one state to another only depend on the target state and the recurrent probability of the current state) the Agent's problem can be solved by a greedy algorithm. The Designer's optimization problem (designing a custom suite for the Agent so as to optimize, through the Agent's optimum reaction, the Designer's revenue), is NP-hard to approximate within any finite ratio; however, the special case, while still NP-hard, has an FPTAS. These results generalize from a single Agent to a distribution of Agents with finite support, as well as to the setting where the Designer must find the best response to the existing strategies of other Designers. We discuss other implications of our results and directions of future research.
Submitted 12 July, 2021; v1 submitted 13 September, 2020;
originally announced September 2020.
-
On the Potential of Dynamic Substructuring Methods for Model Updating
Authors:
Thomas Simpson,
Vasilis Dertimanis,
Costas Papadimitriou,
Eleni Chatzi
Abstract:
While purely data-driven assessment is feasible for the first levels of the Structural Health Monitoring (SHM) process, namely damage detection and arguably damage localization, this does not hold true for more advanced processes. The tasks of damage quantification and eventually residual life prognosis are invariably linked to availability of a representation of the system, which bears physical connotation. In this context, it is often desirable to assimilate data and models, into what is often termed a digital twin of the monitored system.
One common approach to this end is to exploit structural mechanics models that rely on Finite Element (FE) approximations. Proper updating of these models, and their incorporation in an inverse problem setting, may allow for damage quantification and localization, as well as more advanced tasks, including reliability analysis and fatigue assessment. However, this may only be achieved by means of repeated analyses of the forward model, which implies a considerable computational toll when the model used is a detailed FE representation. To tackle this issue, reduced order models can be adopted, which retain the parameterisation and the link to the parameters regulating the physical properties while greatly reducing the computational burden.
In this work, a detailed FE model of a wind turbine tower is considered, and reduced forms of this model are found using both the Craig Bampton and Dual Craig Bampton methods. These reduced order models are then used and compared in a Transitional Markov Chain Monte Carlo procedure to localise and quantify damage which is introduced to the system.
Submitted 30 April, 2021; v1 submitted 30 June, 2020;
originally announced June 2020.
-
A New Age of Computing and the Brain
Authors:
Polina Golland,
Jack Gallant,
Greg Hager,
Hanspeter Pfister,
Christos Papadimitriou,
Stefan Schaal,
Joshua T. Vogelstein
Abstract:
The histories of computer science and the brain sciences are intertwined. In his unfinished manuscript "The Computer and the Brain," von Neumann debates whether or not the brain can be thought of as a computing machine and identifies some of the similarities and differences between natural and artificial computation. Turing, in his 1950 article in Mind, argues that computing devices could ultimately emulate intelligence, leading to his proposed Turing test. Herbert Simon predicted in 1957 that most psychological theories would take the form of a computer program. In 1976, David Marr proposed that the function of the visual system could be abstracted and studied at computational and algorithmic levels that did not depend on the underlying physical substrate.
In December 2014, a two-day workshop supported by the Computing Community Consortium (CCC) and the National Science Foundation's Computer and Information Science and Engineering Directorate (NSF CISE) was convened in Washington, DC, with the goal of bringing together computer scientists and brain researchers to explore these new opportunities and connections, and develop a new, modern dialogue between the two research communities. Specifically, our objectives were: 1. To articulate a conceptual framework for research at the interface of brain sciences and computing and to identify key problems in this interface, presented in a way that will attract both CISE and brain researchers into this space. 2. To inform and excite researchers within the CISE research community about brain research opportunities and to identify and explain strategic roles they can play in advancing this initiative. 3. To develop new connections, conversations and collaborations between brain sciences and CISE researchers that will lead to highly relevant and competitive proposals, high-impact research, and influential publications.
Submitted 27 April, 2020;
originally announced April 2020.
-
An Axiomatic Approach to Block Rewards
Authors:
Xi Chen,
Christos Papadimitriou,
Tim Roughgarden
Abstract:
Proof-of-work blockchains reward each miner for one completed block by an amount that is, in expectation, proportional to the number of hashes the miner contributed to the mining of the block. Is this proportional allocation rule optimal? And in what sense? And what other rules are possible? In particular, what are the desirable properties that any "good" allocation rule should satisfy? To answer these questions, we embark on an axiomatic theory of incentives in proof-of-work blockchains at the time scale of a single block. We consider desirable properties of allocation rules including: symmetry; budget balance (weak or strong); sybil-proofness; and various grades of collusion-proofness. We show that Bitcoin's proportional allocation rule is the unique allocation rule satisfying a certain system of properties, but this does not hold for slightly weaker sets of properties, or when the miners are not risk-neutral. We also point out that a rich class of allocation rules can be approximately implemented in a proof-of-work blockchain.
△ Less
Submitted 23 September, 2019;
originally announced September 2019.
-
Tarski's Theorem, Supermodular Games, and the Complexity of Equilibria
Authors:
Kousha Etessami,
Christos Papadimitriou,
Aviad Rubinstein,
Mihalis Yannakakis
Abstract:
The use of monotonicity and Tarski's theorem in existence proofs of equilibria is very widespread in economics, while Tarski's theorem is also often used for similar purposes in the context of verification. However, there has been relatively little in the way of analysis of the complexity of finding the fixed points and equilibria guaranteed by this result. We study a computational formalism based on monotone functions on the $d$-dimensional grid with sides of length $N$, and their fixed points, as well as the closely connected subject of supermodular games and their equilibria. It is known that finding some (any) fixed point of a monotone function can be done in time $\log^d N$, and we show it requires at least $\log^2 N$ function evaluations already on the 2-dimensional grid, even for randomized algorithms. We show that the general Tarski problem of finding some fixed point, when the monotone function is given succinctly (by a boolean circuit), is in the class PLS of problems solvable by local search and, rather surprisingly, also in the class PPAD. Finding the greatest or least fixed point guaranteed by Tarski's theorem, however, requires $d\cdot N$ steps, and is NP-hard in the white box model. For supermodular games, we show that finding an equilibrium in such games is essentially computationally equivalent to the Tarski problem, and finding the maximum or minimum equilibrium is similarly harder. Interestingly, two-player supermodular games where the strategy space of one player is one-dimensional can be solved in $O(\log N)$ steps. We also observe that computing (approximating) the value of Condon's (Shapley's) stochastic games reduces to the Tarski problem. An important open problem highlighted by this work is proving an $Ω(\log^d N)$ lower bound for small fixed dimension $d \geq 3$.
Submitted 7 September, 2019;
originally announced September 2019.
-
$α$-Rank: Multi-Agent Evaluation by Evolution
Authors:
Shayegan Omidshafiei,
Christos Papadimitriou,
Georgios Piliouras,
Karl Tuyls,
Mark Rowland,
Jean-Baptiste Lespiau,
Wojciech M. Czarnecki,
Marc Lanctot,
Julien Perolat,
Remi Munos
Abstract:
We introduce $α$-Rank, a principled evolutionary dynamics methodology for the evaluation and ranking of agents in large-scale multi-agent interactions, grounded in a novel dynamical game-theoretic solution concept called Markov-Conley chains (MCCs). The approach leverages continuous- and discrete-time evolutionary dynamical systems applied to empirical games, and scales tractably in the number of agents, the type of interactions, and the type of empirical games (symmetric and asymmetric). Current models are fundamentally limited in one or more of these dimensions and are not guaranteed to converge to the desired game-theoretic solution concept (typically the Nash equilibrium). $α$-Rank provides a ranking over the set of agents under evaluation and provides insights into their strengths, weaknesses, and long-term dynamics. This is a consequence of the links we establish to the MCC solution concept when the underlying evolutionary model's ranking-intensity parameter, $α$, is chosen to be large, which exactly forms the basis of $α$-Rank. In contrast to the Nash equilibrium, which is a static concept based on fixed points, MCCs are a dynamical solution concept based on the Markov chain formalism, Conley's Fundamental Theorem of Dynamical Systems, and the core ingredients of dynamical systems: fixed points, recurrent sets, periodic orbits, and limit cycles. $α$-Rank runs in polynomial time with respect to the total number of pure strategy profiles, whereas computing a Nash equilibrium for a general-sum game is known to be intractable. We introduce proofs that not only provide a unifying perspective of existing continuous- and discrete-time evolutionary evaluation models, but also reveal the formal underpinnings of the $α$-Rank methodology. We empirically validate the method in several domains including AlphaGo, AlphaZero, MuJoCo Soccer, and Poker.
Submitted 4 October, 2019; v1 submitted 4 March, 2019;
originally announced March 2019.
-
Optimal Strategies of Blotto Games: Beyond Convexity
Authors:
Soheil Behnezhad,
Avrim Blum,
Mahsa Derakhshan,
MohammadTaghi Hajiaghayi,
Christos H. Papadimitriou,
Saeed Seddighin
Abstract:
The Colonel Blotto game, first introduced by Borel in 1921, is a well-studied game theory classic. Two colonels each have a pool of troops that they divide simultaneously among a set of battlefields. The winner of each battlefield is the colonel who puts more troops in it and the overall utility of each colonel is the sum of weights of the battlefields that s/he wins. Over the past century, the Colonel Blotto game has found applications in many different forms of competition from advertisements to politics to sports.
Two main objectives have been proposed for this game in the literature: (i) maximizing the guaranteed expected payoff, and (ii) maximizing the probability of obtaining a minimum payoff $u$. The former corresponds to the conventional utility maximization and the latter concerns scenarios such as elections where the candidates' goal is to maximize the probability of getting at least half of the votes (rather than the expected number of votes). In this paper, we consider both of these objectives and show how it is possible to obtain (almost) optimal solutions that have few strategies in their support.
One of the main technical challenges in obtaining bounded support strategies for the Colonel Blotto game is that the solution space becomes non-convex. This prevents us from using convex programming techniques in finding optimal strategies which are essentially the main tools that are used in the literature. However, we show through a set of structural results that the solution space can, interestingly, be partitioned into polynomially many disjoint convex polytopes that can be considered independently. Coupled with a number of other combinatorial observations, this leads to polynomial time approximation schemes for both of the aforementioned objectives.
Submitted 14 January, 2019;
originally announced January 2019.
-
Smoothed Analysis of Discrete Tensor Decomposition and Assemblies of Neurons
Authors:
Nima Anari,
Constantinos Daskalakis,
Wolfgang Maass,
Christos H. Papadimitriou,
Amin Saberi,
Santosh Vempala
Abstract:
We analyze linear independence of rank one tensors produced by tensor powers of randomly perturbed vectors. This enables efficient decomposition of sums of high-order tensors. Our analysis builds upon [BCMV14] but allows for a wider range of perturbation models, including discrete ones. We give an application to recovering assemblies of neurons.
Assemblies are large sets of neurons representing specific memories or concepts. The size of the intersection of two assemblies has been shown in experiments to represent the extent to which these memories co-occur or these concepts are related; the phenomenon is called association of assemblies. This suggests that an animal's memory is a complex web of associations, and poses the problem of recovering this representation from cognitive data. Motivated by this problem, we study the following more general question: Can we reconstruct the Venn diagram of a family of sets, given the sizes of their $\ell$-wise intersections? We show that as long as the family of sets is randomly perturbed, it is enough for the number of measurements to be polynomially larger than the number of nonempty regions of the Venn diagram to fully reconstruct the diagram.
Submitted 28 October, 2018;
originally announced October 2018.
-
Passive Static Equilibrium with Frictional Contacts and Application to Grasp Stability Analysis
Authors:
Maximilian Haas-Heger,
Christos Papadimitriou,
Mihalis Yannakakis,
Garud Iyengar,
Matei Ciocarlie
Abstract:
This paper studies the problem of passive grasp stability under an external disturbance, that is, the ability of a grasp to resist a disturbance through passive responses at the contacts. To obtain physically consistent results, such a model must account for friction phenomena at each contact; the difficulty is that friction forces depend in non-linear fashion on contact behavior (stick or slip). We develop the first polynomial-time algorithm which either solves such complex equilibrium constraints for two-dimensional grasps, or otherwise concludes that no solution exists. To achieve this, we show that the number of possible 'slip states' (where each contact is labeled as either sticking or slipping) that must be considered is polynomial (in fact quadratic) in the number of contacts, and not exponential as previously thought. Our algorithm captures passive response behaviors at each contact, while accounting for constraints on friction forces such as the maximum dissipation principle.
△ Less
Submitted 13 June, 2018; v1 submitted 4 June, 2018;
originally announced June 2018.
-
Wealth Inequality and the Price of Anarchy
Authors:
Kurtuluş Gemici,
Elias Koutsoupias,
Barnabé Monnot,
Christos Papadimitriou,
Georgios Piliouras
Abstract:
Price of anarchy quantifies the degradation of social welfare in games due to the lack of a centralized authority that can enforce the optimal outcome. At its antipodes, mechanism design studies how to ameliorate these effects by incentivizing socially desirable behavior and implementing the optimal state as equilibrium. In practice, the responsiveness to such measures depends on the wealth of each individual. This leads to a natural, but largely unexplored, question. Does optimal mechanism design entrench, or maybe even exacerbate, social inequality?
We study this question in nonatomic congestion games, arguably one of the most thoroughly studied settings from the perspectives of price of anarchy as well as mechanism design. We introduce a new model that incorporates the wealth distribution of the population and captures the income elasticity of travel time. This allows us to argue about the equality of wealth distribution both before and after employing a mechanism. We start our analysis by establishing a broad qualitative result, showing that tolls always increase inequality in symmetric congestion games under any reasonable metric of inequality, e.g., the Gini index. Next, we introduce the iniquity index, a novel measure for quantifying the magnitude of these forces towards a more unbalanced wealth distribution and show it has good normative properties (robustness to scaling of income, no-regret learning). We analyze iniquity both in theoretical settings (Pigou's network under various wealth distributions) as well as experimental ones (based on a large scale field experiment in Singapore). Finally, we provide an algorithm for computing optimal tolls for any point of the trade-off of relative importance of efficiency and equality. We conclude with a discussion of our findings in the context of theories of justice as developed in contemporary social sciences.
Submitted 26 February, 2018;
originally announced February 2018.
-
Cycles in adversarial regularized learning
Authors:
Panayotis Mertikopoulos,
Christos Papadimitriou,
Georgios Piliouras
Abstract:
Regularized learning is a fundamental technique in online optimization, machine learning and many other fields of computer science. A natural question that arises in these settings is how regularized learning algorithms behave when faced against each other. We study a natural formulation of this problem by coupling regularized learning dynamics in zero-sum games. We show that the system's behavior is Poincaré recurrent, implying that almost every trajectory revisits any (arbitrarily small) neighborhood of its starting point infinitely often. This cycling behavior is robust to the agents' choice of regularization mechanism (each agent could be using a different regularizer), to positive-affine transformations of the agents' utilities, and it also persists in the case of networked competition, i.e., for zero-sum polymatrix games.
Submitted 8 September, 2017;
originally announced September 2017.
-
Power-Law Distributions in a Two-sided Market and Net Neutrality
Authors:
Xiaotie Deng,
Zhe Feng,
Christos H. Papadimitriou
Abstract:
"Net neutrality" often refers to the policy dictating that an Internet service provider (ISP) cannot charge content providers (CPs) for delivering their content to consumers. Many past quantitative models designed to determine whether net neutrality is a good idea have been rather equivocal in their conclusions. Here we propose a very simple two-sided market model, in which the types of the consumers and the CPs are {\em power-law distributed} --- a kind of distribution known to often arise precisely in connection with Internet-related phenomena. We derive mostly analytical, closed-form results for several regimes: (a) Net neutrality, (b) social optimum, (c) maximum revenue by the ISP, or (d) maximum ISP revenue under quality differentiation. One unexpected conclusion is that (a) and (b) will differ significantly, unless average CP productivity is very high.
Submitted 15 October, 2016;
originally announced October 2016.
-
On the optimality of grid cells
Authors:
Christos H. Papadimitriou
Abstract:
Grid cells, discovered more than a decade ago [5], are neurons in the brain of mammals that fire when the animal is located near certain specific points in its familiar terrain. Intriguingly, these points form, for a single cell, a two-dimensional triangular grid, not unlike our Figure 3. Grid cells are widely believed to be involved in path integration, that is, the maintenance of a location state through the summation of small displacements. We provide theoretical evidence for this assertion by showing that cells with grid-like tuning curves are indeed well adapted for the path integration task. In particular we prove that, in one dimension under Gaussian noise, the sensitivity of measuring small displacements is maximized by a population of neurons whose tuning curves are near-sinusoids -- that is to say, with peaks forming a one-dimensional grid. We also show that effective computation of the displacement is possible through a second population of cells whose sinusoid tuning curves are in phase difference from the first. In two dimensions, under additional assumptions it can be shown that measurement sensitivity is optimized by the product of two sinusoids, again yielding a grid-like pattern. We discuss the connection of our results to the triangular grid pattern observed in animals.
Submitted 15 June, 2016;
originally announced June 2016.
-
On Satisfiability Problems with a Linear Structure
Authors:
Serge Gaspers,
Christos Papadimitriou,
Sigve Hortemo Saether,
Jan Arne Telle
Abstract:
It was recently shown \cite{STV} that satisfiability is polynomially solvable when the incidence graph is an interval bipartite graph (an interval graph turned into a bipartite graph by omitting all edges within each partite set). Here we relax this condition in several directions: First, we show that it holds for $k$-interval bigraphs, bipartite graphs which can be converted to interval bipartite graphs by adding to each node of one side at most $k$ edges; the same result holds for the counting and the weighted maximization version of satisfiability. Second, given two linear orders, one for the variables and one for the clauses, we show how to find, in polynomial time, the smallest $k$ such that there is a $k$-interval bigraph compatible with these two orders. On the negative side we prove that, barring complexity collapses, no such extensions are possible for CSPs more general than satisfiability. We also show NP-hardness of recognizing 1-interval bigraphs.
Submitted 25 February, 2016;
originally announced February 2016.
-
On the Computational Complexity of Limit Cycles in Dynamical Systems
Authors:
Christos H. Papadimitriou,
Nisheeth K. Vishnoi
Abstract:
We study the Poincaré-Bendixson theorem for two-dimensional continuous dynamical systems in compact domains from the point of view of computation, seeking algorithms for finding the limit cycle promised by this classical result. We start by considering a discrete analogue of this theorem and show that both finding a point on a limit cycle, and determining if a given point is on one, are PSPACE-complete.
For the continuous version, we show that both problems are uncomputable in the real complexity sense; i.e., their complexity is arbitrarily high. Subsequently, we introduce a notion of an "approximate cycle" and prove an "approximate" Poincaré-Bendixson theorem guaranteeing that some orbits come very close to forming a cycle in the absence of approximate fixpoints; surprisingly, it holds for all dimensions. The corresponding computational problem defined in terms of arithmetic circuits is PSPACE-complete.
Submitted 24 November, 2015;
originally announced November 2015.
-
Locally Adaptive Optimization: Adaptive Seeding for Monotone Submodular Functions
Authors:
Ashwinkumar Badanidiyuru,
Christos Papadimitriou,
Aviad Rubinstein,
Lior Seeman,
Yaron Singer
Abstract:
The Adaptive Seeding problem is an algorithmic challenge motivated by influence maximization in social networks: One seeks to select among certain accessible nodes in a network, and then select, adaptively, among neighbors of those nodes as they become accessible in order to maximize a global objective function. More generally, adaptive seeding is a stochastic optimization framework where the choices in the first stage affect the realizations in the second stage, over which we aim to optimize.
Our main result is a $(1-1/e)^2$-approximation for the adaptive seeding problem for any monotone submodular function. While adaptive policies are often approximated via non-adaptive policies, our algorithm is based on a novel method we call \emph{locally-adaptive} policies. These policies combine a non-adaptive global structure, with local adaptive optimizations. This method enables the $(1-1/e)^2$-approximation for general monotone submodular functions and circumvents some of the impossibilities associated with non-adaptive policies.
We also introduce a fundamental problem in submodular optimization that may be of independent interest: given a ground set of elements where every element appears with some small probability, find a set of expected size at most $k$ that has the highest expected value over the realization of the elements. We show a surprising result: there are classes of monotone submodular functions (including coverage) that can be approximated almost optimally as the probability vanishes. For general monotone submodular functions we show via a reduction from \textsc{Planted-Clique} that approximations for this problem are not likely to be obtainable. This optimization problem is an important tool for adaptive seeding via non-adaptive policies, and its hardness motivates the introduction of \emph{locally-adaptive} policies we use in the main result.
Submitted 8 July, 2015;
originally announced July 2015.
-
Strategic Classification
Authors:
Moritz Hardt,
Nimrod Megiddo,
Christos Papadimitriou,
Mary Wootters
Abstract:
Machine learning relies on the assumption that unseen test instances of a classification problem follow the same distribution as observed training data. However, this principle can break down when machine learning is used to make important decisions about the welfare (employment, education, health) of strategic individuals. Knowing information about the classifier, such individuals may manipulate their attributes in order to obtain a better classification outcome. As a result of this behavior---often referred to as gaming---the performance of the classifier may deteriorate sharply. Indeed, gaming is a well-known obstacle for using machine learning methods in practice; in financial policy-making, the problem is widely known as Goodhart's law. In this paper, we formalize the problem, and pursue algorithms for learning classifiers that are robust to gaming.
We model classification as a sequential game between a player named "Jury" and a player named "Contestant." Jury designs a classifier, and Contestant receives an input to the classifier, which he may change at some cost. Jury's goal is to achieve high classification accuracy with respect to Contestant's original input and some underlying target classification function. Contestant's goal is to achieve a favorable classification outcome while taking into account the cost of achieving it.
For a natural class of cost functions, we obtain computationally efficient learning algorithms which are near-optimal. Surprisingly, our algorithms are efficient even on concept classes that are computationally hard to learn. For general cost functions, designing an approximately optimal strategy-proof classifier, for inverse-polynomial approximation, is NP-hard.
Submitted 22 November, 2015; v1 submitted 23 June, 2015;
originally announced June 2015.
-
Can Almost Everybody be Almost Happy? PCP for PPAD and the Inapproximability of Nash
Authors:
Yakov Babichenko,
Christos Papadimitriou,
Aviad Rubinstein
Abstract:
We conjecture that PPAD has a PCP-like complete problem, seeking a near equilibrium in which all but very few players have very little incentive to deviate. We show that, if one assumes that this problem requires exponential time, several open problems in this area are settled. The most important implication, proved via a "birthday repetition" reduction, is that the n^O(log n) approximation scheme of [LMM03] for the Nash equilibrium of two-player games is essentially optimum. Two other open problems in the area are resolved once one assumes this conjecture, establishing that certain approximate equilibria are PPAD-complete: Finding a relative approximation of two-player Nash equilibria (without the well-supported restriction of [Das13]), and an approximate competitive equilibrium with equal incomes [Bud11] with small clearing error and near-optimal Gini coefficient.
Submitted 8 June, 2015; v1 submitted 9 April, 2015;
originally announced April 2015.
-
Unsupervised Learning through Prediction in a Model of Cortex
Authors:
Christos H. Papadimitriou,
Santosh S. Vempala
Abstract:
We propose a primitive called PJOIN, for "predictive join," which combines and extends the operations JOIN and LINK, which Valiant proposed as the basis of a computational theory of cortex. We show that PJOIN can be implemented in Valiant's model. We also show that, using PJOIN, certain reasonably complex learning and pattern matching tasks can be performed, in a way that involves phenomena which have been observed in cognition and the brain, namely memory-based prediction and downward traffic in the cortical hierarchy.
Submitted 26 December, 2014;
originally announced December 2014.
-
Optimum Statistical Estimation with Strategic Data Sources
Authors:
Yang Cai,
Constantinos Daskalakis,
Christos H. Papadimitriou
Abstract:
We propose an optimum mechanism for providing monetary incentives to the data sources of a statistical estimator such as linear regression, so that high quality data is provided at low cost, in the sense that the sum of payments and estimation error is minimized. The mechanism applies to a broad range of estimators, including linear and polynomial regression, kernel regression, and, under some additional assumptions, ridge regression. It also generalizes to several objectives, including minimizing estimation error subject to budget constraints. Besides our concrete results for regression problems, we contribute a mechanism design framework through which to design and analyze statistical estimators whose examples are supplied by workers with cost for labeling said examples.
Submitted 24 April, 2015; v1 submitted 11 August, 2014;
originally announced August 2014.
-
On the Complexity of Dynamic Mechanism Design
Authors:
Christos Papadimitriou,
George Pierrakos,
Christos-Alexandros Psomas,
Aviad Rubinstein
Abstract:
We introduce a dynamic mechanism design problem in which the designer wants to offer for sale an item to an agent, and another item to the same agent at some point in the future. The agent's joint distribution of valuations for the two items is known, and the agent knows the valuation for the current item (but not for the one in the future). The designer seeks to maximize expected revenue, and the auction must be deterministic, truthful, and ex post individually rational. The optimum mechanism involves a protocol whereby the seller elicits the buyer's current valuation, and based on the bid makes two take-it-or-leave-it offers, one for now and one for the future. We show that finding the optimum deterministic mechanism in this situation - arguably the simplest meaningful dynamic mechanism design problem imaginable - is NP-hard. We also prove several positive results, among them a polynomial linear programming-based algorithm for the optimum randomized auction (even for many bidders and periods), and we show strong separations in revenue between non-adaptive, adaptive, and randomized auctions, even when the valuations in the two periods are uncorrelated. Finally, for the same problem in an environment in which contracts cannot be enforced, and thus perfection of equilibrium is necessary, we show that the optimum randomized mechanism requires multiple rounds of cheap talk-like interactions.
Submitted 18 May, 2023; v1 submitted 21 July, 2014;
originally announced July 2014.
-
On Simplex Pivoting Rules and Complexity Theory
Authors:
Ilan Adler,
Christos Papadimitriou,
Aviad Rubinstein
Abstract:
We show that there are simplex pivoting rules for which it is PSPACE-complete to tell if a particular basis will appear on the algorithm's path. Such rules cannot be the basis of a strongly polynomial algorithm, unless P = PSPACE. We conjecture that the same can be shown for most known variants of the simplex method. However, we also point out that Dantzig's shadow vertex algorithm has a polynomial path problem. Finally, we discuss in the same context randomized pivoting rules.
Submitted 12 April, 2014;
originally announced April 2014.
-
Word-length entropies and correlations of natural language written texts
Authors:
Maria Kalimeri,
Vassilios Constantoudis,
Constantinos Papadimitriou,
Konstantinos Karamanos,
Fotis K. Diakonos,
Harris Papageorgiou
Abstract:
We study the frequency distributions and correlations of the word lengths of ten European languages. Our findings indicate that a) the word-length distribution of short words quantified by the mean value and the entropy distinguishes the Uralic (Finnish) corpus from the others, b) the tails at long words, manifested in the high-order moments of the distributions, differentiate the Germanic languages (except for English) from the Romanic languages and Greek and c) the correlations between nearby word lengths measured by the comparison of the real entropies with those of the shuffled texts are found to be smaller in the case of Germanic and Finnish languages.
Submitted 23 January, 2014;
originally announced January 2014.
-
Entropy analysis of word-length series of natural language texts: Effects of text language and genre
Authors:
Maria Kalimeri,
Vassilios Constantoudis,
Constantinos Papadimitriou,
Kostantinos Karamanos,
Fotis K. Diakonos,
Haris Papageorgiou
Abstract:
We estimate the $n$-gram entropies of natural language texts in word-length representation and find that these are sensitive to text language and genre. We attribute this sensitivity to changes in the probability distribution of the lengths of single words and emphasize the crucial role of the uniformity of probabilities of having words with length between five and ten. Furthermore, comparison with the entropies of shuffled data reveals the impact of word length correlations on the estimated $n$-gram entropies.
Submitted 16 January, 2014;
originally announced January 2014.
-
The Complexity of Fairness through Equilibrium
Authors:
Abraham Othman,
Christos Papadimitriou,
Aviad Rubinstein
Abstract:
Competitive equilibrium with equal incomes (CEEI) is a well known fair allocation mechanism; however, for indivisible resources a CEEI may not exist. It was shown in [Budish '11] that in the case of indivisible resources there is always an allocation, called A-CEEI, that is approximately fair, approximately truthful, and approximately efficient, for some favorable approximation parameters. This approximation is used in practice to assign students to classes. In this paper we show that finding the A-CEEI allocation guaranteed to exist by Budish's theorem is PPAD-complete. We further show that finding an approximate equilibrium with better approximation guarantees is even harder: NP-complete.
Submitted 30 September, 2014; v1 submitted 21 December, 2013;
originally announced December 2013.
-
Satisfiability and Evolution
Authors:
Adi Livnat,
Christos Papadimitriou,
Aviad Rubinstein,
Gregory Valiant,
Andrew Wan
Abstract:
We show that, if truth assignments on $n$ variables reproduce through recombination so that satisfaction of a particular Boolean function confers a small evolutionary advantage, then a polynomially large population over polynomially many generations (polynomial in $n$ and the inverse of the initial satisfaction probability) will end up almost certainly consisting exclusively of satisfying truth assignments. We argue that this theorem sheds light on the problem of novelty in Evolution.
Submitted 11 August, 2014; v1 submitted 6 December, 2013;
originally announced December 2013.
-
Sparse Covers for Sums of Indicators
Authors:
Constantinos Daskalakis,
Christos Papadimitriou
Abstract:
For all $n, ε>0$, we show that the set of Poisson Binomial distributions on $n$ variables admits a proper $ε$-cover in total variation distance of size $n^2+n \cdot (1/ε)^{O(\log^2 (1/ε))}$, which can also be computed in polynomial time. We discuss the implications of our construction for approximation algorithms and the computation of approximate Nash equilibria in anonymous games.
Submitted 1 October, 2014; v1 submitted 5 June, 2013;
originally announced June 2013.
-
Learning and Verifying Quantified Boolean Queries by Example
Authors:
Azza Abouzied,
Dana Angluin,
Christos Papadimitriou,
Joseph M. Hellerstein,
Avi Silberschatz
Abstract:
To help a user specify and verify quantified queries --- a class of database queries known to be very challenging for all but the most expert users --- one can question the user on whether certain data objects are answers or non-answers to her intended query. In this paper, we analyze the number of questions needed to learn or verify qhorn queries, a special class of Boolean quantified queries whose underlying form is conjunctions of quantified Horn expressions. We provide optimal polynomial-question and polynomial-time learning and verification algorithms for two subclasses of the class qhorn with upper constant limits on a query's causal density.
Submitted 15 April, 2013;
originally announced April 2013.