doi 10.1038/s42256-023-00754-x

A social path to human-like artificial intelligence

Authors: Edgar A. Duéñez-Guzmán, Suzanne Sadedin, Jane X. Wang, Kevin R. McKee, Joel Z. Leibo

Abstract: Traditionally, cognitive and computer scientists have viewed intelligence solipsistically, as a property of unitary agents devoid of social context. Given the success of contemporary learning algorithms, we argue that the bottleneck in artificial intelligence (AI) progress is shifting from data assimilation to novel data generation. We bring together evidence showing that natural intelligence emerges at multiple scales in networks of interacting agents via collective living, social relationships and major evolutionary transitions, which contribute to novel data generation through mechanisms such as population pressures, arms races, Machiavellian selection, social learning and cumulative culture. Many breakthroughs in AI exploit some of these processes, from multi-agent structures enabling algorithms to master complex games like Capture-The-Flag and StarCraft II, to strategic communication in Diplomacy and the shaping of AI data streams by other AIs. Moving beyond a solipsistic view of agency to integrate these mechanisms suggests a path to human-like compounding innovation through ongoing novel data generation.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 17 pages, 2 figures, 1 box

MSC Class: 68T05 ACM Class: I.2.6

arXiv:2404.10179 [pdf, other]

Scaling Instructable Agents Across Many Simulated Worlds

Authors: SIMA Team, Maria Abi Raad, Arun Ahuja, Catarina Barros, Frederic Besse, Andrew Bolt, Adrian Bolton, Bethanie Brownfield, Gavin Buttimore, Max Cant, Sarah Chakera, Stephanie C. Y. Chan, Jeff Clune, Adrian Collister, Vikki Copeman, Alex Cullum, Ishita Dasgupta, Dario de Cesare, Julia Di Trapani, Yani Donchev, Emma Dunleavy, Martin Engelcke, Ryan Faulkner, Frankie Garcia, Charles Gbadamosi, et al. (68 additional authors not shown)

Abstract: Building embodied AI systems that can follow arbitrary language instructions in any 3D environment is a key challenge for creating general AI. Accomplishing this goal requires learning to ground language in perception and embodied actions, in order to accomplish complex tasks. The Scalable, Instructable, Multiworld Agent (SIMA) project tackles this by training agents to follow free-form instructions across a diverse range of virtual 3D environments, including curated research environments as well as open-ended, commercial video games. Our goal is to develop an instructable agent that can accomplish anything a human can do in any simulated 3D environment. Our approach focuses on language-driven generality while imposing minimal assumptions. Our agents interact with environments in real-time using a generic, human-like interface: the inputs are image observations and language instructions and the outputs are keyboard-and-mouse actions. This general approach is challenging, but it allows agents to ground language across many visually complex and semantically rich environments while also allowing us to readily run agents in new environments. In this paper we describe our motivation and goal, the initial progress we have made, and promising preliminary results on several diverse research environments and a variety of commercial video games.

Submitted 17 April, 2024; v1 submitted 13 March, 2024; originally announced April 2024.

arXiv:2402.18225 [pdf, other]

CogBench: a large language model walks into a psychology lab

Authors: Julian Coda-Forno, Marcel Binz, Jane X. Wang, Eric Schulz

Abstract: Large language models (LLMs) have significantly advanced the field of artificial intelligence. Yet, evaluating them comprehensively remains challenging. We argue that this is partly due to the predominant focus on performance metrics in most benchmarks. This paper introduces CogBench, a benchmark that includes ten behavioral metrics derived from seven cognitive psychology experiments. This novel approach offers a toolkit for phenotyping LLMs' behavior. We apply CogBench to 35 LLMs, yielding a rich and diverse dataset. We analyze this data using statistical multilevel modeling techniques, accounting for the nested dependencies among fine-tuned versions of specific LLMs. Our study highlights the crucial role of model size and reinforcement learning from human feedback (RLHF) in improving performance and aligning with human behavior. Interestingly, we find that open-source models are less risk-prone than proprietary models and that fine-tuning on code does not necessarily enhance LLMs' behavior. Finally, we explore the effects of prompt-engineering techniques. We discover that chain-of-thought prompting improves probabilistic reasoning, while take-a-step-back prompting fosters model-based behaviors.

Submitted 28 February, 2024; originally announced February 2024.

arXiv:2305.16183 [pdf, other]

Passive learning of active causal strategies in agents and language models

Authors: Andrew Kyle Lampinen, Stephanie C Y Chan, Ishita Dasgupta, Andrew J Nam, Jane X Wang

Abstract: What can be learned about causality and experimentation from passive data? This question is salient given recent successes of passively-trained language models in interactive domains such as tool use. Passive learning is inherently limited. However, we show that purely passive learning can in fact allow an agent to learn generalizable strategies for determining and using causal structures, as long as the agent can intervene at test time. We formally illustrate that learning a strategy of first experimenting, then seeking goals, can allow generalization from passive learning in principle. We then show empirically that agents trained via imitation on expert data can indeed generalize at test time to infer and use causal links which are never present in the training data; these agents can also generalize experimentation strategies to novel variable sets never observed in training. We then show that strategies for causal intervention and exploitation can be generalized from passive data even in a more complex environment with high-dimensional observations, with the support of natural language explanations. Explanations can even allow passive learners to generalize out-of-distribution from perfectly-confounded training data. Finally, we show that language models, trained only on passive next-word prediction, can generalize causal intervention strategies from a few-shot prompt containing examples of experimentation, together with explanations and reasoning. These results highlight the surprising power of passive learning of active causal strategies, and may help to understand the behaviors and capabilities of language models.

Submitted 2 October, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

Comments: Advances in Neural Information Processing Systems (NeurIPS 2023). 10 pages main text

arXiv:2305.12907 [pdf, other]

Meta-in-context learning in large language models

Authors: Julian Coda-Forno, Marcel Binz, Zeynep Akata, Matthew Botvinick, Jane X. Wang, Eric Schulz

Abstract: Large language models have shown tremendous performance in a variety of tasks. In-context learning -- the ability to improve at a task after being provided with a number of demonstrations -- is seen as one of the main contributors to their success. In the present paper, we demonstrate that the in-context learning abilities of large language models can be recursively improved via in-context learning itself. We coin this phenomenon meta-in-context learning. Looking at two idealized domains, a one-dimensional regression task and a two-armed bandit task, we show that meta-in-context learning adaptively reshapes a large language model's priors over expected tasks. Furthermore, we find that meta-in-context learning modifies the in-context learning strategies of such models. Finally, we extend our approach to a benchmark of real-world regression problems where we observe competitive performance to traditional learning algorithms. Taken together, our work improves our understanding of in-context learning and paves the way toward adapting large language models to the environment in which they are applied purely through meta-in-context learning rather than traditional finetuning.

Submitted 22 May, 2023; originally announced May 2023.

arXiv:2304.06729 [pdf, other]

Meta-Learned Models of Cognition

Authors: Marcel Binz, Ishita Dasgupta, Akshay Jagadish, Matthew Botvinick, Jane X. Wang, Eric Schulz

Abstract: Meta-learning is a framework for learning learning algorithms through repeated interactions with an environment as opposed to designing them by hand. In recent years, this framework has established itself as a promising tool for building models of human cognition. Yet, a coherent research program around meta-learned models of cognition is still missing. The purpose of this article is to synthesize previous work in this field and establish such a research program. We rely on three key pillars to accomplish this goal. We first point out that meta-learning can be used to construct Bayes-optimal learning algorithms. This result not only implies that any behavioral phenomenon that can be explained by a Bayesian model can also be explained by a meta-learned model but also allows us to draw strong connections to the rational analysis of cognition. We then discuss several advantages of the meta-learning framework over traditional Bayesian methods. In particular, we argue that meta-learning can be applied to situations where Bayesian inference is impossible and that it enables us to make rational models of cognition more realistic, either by incorporating limited computational resources or neuroscientific knowledge. Finally, we reexamine prior studies from psychology and neuroscience that have applied meta-learning and put them into the context of these new insights. In summary, our work highlights that meta-learning considerably extends the scope of rational analysis and thereby of cognitive theories more generally.

Submitted 12 April, 2023; originally announced April 2023.

arXiv:2205.05055 [pdf, other]

Data Distributional Properties Drive Emergent In-Context Learning in Transformers

Authors: Stephanie C. Y. Chan, Adam Santoro, Andrew K. Lampinen, Jane X. Wang, Aaditya Singh, Pierre H. Richemond, Jay McClelland, Felix Hill

Abstract: Large transformer-based models are able to perform in-context few-shot learning, without being explicitly trained for it. This observation raises the question: what aspects of the training regime lead to this emergent behavior? Here, we show that this behavior is driven by the distributions of the training data itself. In-context learning emerges when the training data exhibits particular distributional properties such as burstiness (items appear in clusters rather than being uniformly distributed over time) and having large numbers of rarely occurring classes. In-context learning also emerges more strongly when item meanings or interpretations are dynamic rather than fixed. These properties are exemplified by natural language, but are also inherent to naturalistic data in a wide range of other domains. They also depart significantly from the uniform, i.i.d. training distributions typically used for standard supervised learning. In our initial experiments, we found that in-context learning traded off against more conventional weight-based learning, and models were unable to achieve both simultaneously. However, our later experiments uncovered that the two modes of learning could co-exist in a single model when it was trained on data following a skewed Zipfian distribution -- another common property of naturalistic data, including language. In further experiments, we found that naturalistic data distributions were only able to elicit in-context learning in transformers, and not in recurrent models. In sum, our findings indicate how the transformer architecture works together with particular properties of the training data to drive the intriguing emergent in-context learning behaviour of large language models, and how future work might encourage both in-context and in-weights learning in domains beyond language.

Submitted 17 November, 2022; v1 submitted 22 April, 2022; originally announced May 2022.

Comments: Accepted at NeurIPS 2022 (Oral). Code is available at: https://github.com/deepmind/emergent_in_context_learning

arXiv:2205.04757 [pdf, other]

doi 10.1051/0004-6361/202243268

Characterization of Kepler targets based on medium-resolution LAMOST spectra analyzed with ROTFIT

Authors: A. Frasca, J. Molenda-Zakowicz, J. Alonso-Santiago, G. Catanzaro, P. De Cat, J. N. Fu, W. Zong, J. X. Wang, T. Cang, J. T. Wang

Abstract: In this work we present the results of our analysis of 16,300 medium-resolution LAMOST spectra of late-type stars in the Kepler field with the aim of determining the stellar parameters, activity level, lithium atmospheric content, and binarity. We have used a version of the code ROTFIT specifically developed for these spectra. We provide a catalog with the atmospheric parameters (Teff, log(g), and [Fe/H]), radial velocity (RV), and projected rotation velocity (vsini). For cool stars (Teff < 6500 K), we also calculated the H-alpha and LiI-6708 equivalent widths, which are important indicators of chromospheric activity and evolutionary stage, respectively. We have derived the RV and atmospheric parameters for 14,300 spectra of 7443 stars. Literature data were used for a quality control of the results. The Teff and log(g) values are in good agreement with the literature. The [Fe/H] values appear to be overestimated for metal-poor stars. We propose a relation to correct the [Fe/H] values derived with ROTFIT. We were able to identify double-lined binaries, stars with variable RVs, lithium-rich giants, and emission-line objects. Based on the H-alpha flux, we found 327 active stars. We detected the LiI-6708 line and measured its equivalent width for 1657 stars, both giants and stars on the main sequence. Regarding the latter, we performed a discrete age classification based on the atmospheric lithium abundance and the upper envelopes of a few open clusters. Among the giants, we found 195 Li-rich stars, 161 of which are reported here for the first time. No relationship is found between stellar rotation and lithium abundance, which allows us to rule out merger scenarios as the predominant explanation of the enrichment of Li in our sample. The fraction of Li-rich giants, about 4%, is higher than expected.

Submitted 10 May, 2022; originally announced May 2022.

Comments: 32 pages, 34 figures; accepted for publication in Astronomy & Astrophysics

Journal ref: A&A 664, A78 (2022)

arXiv:2204.05080 [pdf, other]

Semantic Exploration from Language Abstractions and Pretrained Representations

Authors: Allison C. Tam, Neil C. Rabinowitz, Andrew K. Lampinen, Nicholas A. Roy, Stephanie C. Y. Chan, DJ Strouse, Jane X. Wang, Andrea Banino, Felix Hill

Abstract: Effective exploration is a challenge in reinforcement learning (RL). Novelty-based exploration methods can suffer in high-dimensional state spaces, such as continuous partially-observable 3D environments. We address this challenge by defining novelty using semantically meaningful state abstractions, which can be found in learned representations shaped by natural language. In particular, we evaluate vision-language representations, pretrained on natural image captioning datasets. We show that these pretrained representations drive meaningful, task-relevant exploration and improve performance on 3D simulated environments. We also characterize why and how language provides useful abstractions for exploration by considering the impacts of using representations from a pretrained model, a language oracle, and several ablations. We demonstrate the benefits of our approach in two very different task domains -- one that stresses the identification and manipulation of everyday objects, and one that requires navigational exploration in an expansive world. Our results suggest that using language-shaped representations could improve exploration for various algorithms and agents in challenging environments.

Submitted 26 April, 2023; v1 submitted 8 April, 2022; originally announced April 2022.

Comments: NeurIPS 2022

arXiv:2204.02329 [pdf, other]

Can language models learn from explanations in context?

Authors: Andrew K. Lampinen, Ishita Dasgupta, Stephanie C. Y. Chan, Kory Matthewson, Michael Henry Tessler, Antonia Creswell, James L. McClelland, Jane X. Wang, Felix Hill

Abstract: Language Models (LMs) can perform new tasks by adapting to a few in-context examples. For humans, explanations that connect examples to task principles can improve learning. We therefore investigate whether explanations of few-shot examples can help LMs. We annotate questions from 40 challenging tasks with answer explanations, and various matched control explanations. We evaluate how different types of explanations, instructions, and controls affect zero- and few-shot performance. We analyze these results using statistical multilevel modeling techniques that account for the nested dependencies among conditions, tasks, prompts, and models. We find that explanations can improve performance -- even without tuning. Furthermore, explanations hand-tuned for performance on a small validation set offer substantially larger benefits, and building a prompt by selecting examples and explanations together substantially improves performance over selecting examples alone. Finally, even untuned explanations outperform carefully matched controls, suggesting that the benefits are due to the link between an example and its explanation, rather than lower-level features. However, only large models benefit. In summary, explanations can support the in-context learning of large LMs on challenging tasks.

Submitted 10 October, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

Comments: Findings of EMNLP 2022

arXiv:2201.11050 [pdf, other]

Response of the Fe K_alpha line emission to the X-ray continuum variability in the changing-look active galactic nucleus NGC 1566

Authors: W. C. Liang, X. W. Shu, J. X. Wang, Y. Tan, W. J. Zhang, L. M. Sun, N. Jiang, L. M. Dou

Abstract: NGC 1566 is a changing look AGN known to exhibit recurrent X-ray outbursts with each lasting for several years. The most recent X-ray outburst was observed in 2018, with a substantial increase of 2--10 keV flux by a factor of ~24 relative to the historical minimum. We re-analyze the XMM-Newton and NuSTAR observations covering the pre-outburst, outburst and post-outburst epochs, and confirm the discovery of the broad feature in the ~5--7 keV band during the period of outburst that could be interpreted as a relativistic Fe K_alpha emission line. Our analysis suggests that its flux has increased in tandem with the 2--10 keV continuum, making it the second changing look AGN in which the broad Fe K_alpha line responds to the X-ray continuum variability. This behavior strongly supports the idea that X-rays originate in a corona above the accretion disk, and disk reflection produces the relativistic Fe K_alpha line. In addition, we find the response of narrow Fe K_alpha emission line to the changes in the X-ray continuum on a time-scale as short as four months, allowing us to put the location of line-emitting region at <0.1 pc, comparable to the size of optical BLR. By comparing to the changing look AGN NGC 2992, the Fe K_alpha variation rate (the ratio of Fe K_alpha variation to luminosity variation) in NGC 1566 appears greater, which could be possibly explained by larger amount of gas or Fe abundance responsible for producing the Fe K_alpha line for the latter. The strength of variable broad Fe K_alpha line as well as the soft X-ray excess emission appears to be correlated with the accretion rate, which could be explained as due to the state transition associated with the changing-look phenomenon.

Submitted 26 January, 2022; originally announced January 2022.

Comments: Accepted for publication in JHEAp, 21 pages, 13 figures and 3 Tables

arXiv:2112.03753 [pdf, other]

Tell me why! Explanations support learning relational and causal structure

Authors: Andrew K. Lampinen, Nicholas A. Roy, Ishita Dasgupta, Stephanie C. Y. Chan, Allison C. Tam, James L. McClelland, Chen Yan, Adam Santoro, Neil C. Rabinowitz, Jane X. Wang, Felix Hill

Abstract: Inferring the abstract relational and causal structure of the world is a major challenge for reinforcement-learning (RL) agents. For humans, language--particularly in the form of explanations--plays a considerable role in overcoming this challenge. Here, we show that language can play a similar role for deep RL agents in complex environments. While agents typically struggle to acquire relational and causal knowledge, augmenting their experience by training them to predict language descriptions and explanations can overcome these limitations. We show that language can help agents learn challenging relational tasks, and examine which aspects of language contribute to its benefits. We then show that explanations can help agents to infer not only relational but also causal structure. Language can shape the way that agents generalize out-of-distribution from ambiguous, causally-confounded training, and explanations even allow agents to learn to perform experimental interventions to identify causal relationships. Our results suggest that language description and explanation may be powerful tools for improving agent learning and generalization.

Submitted 25 May, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

Comments: ICML 2022; 23 pages

ACM Class: I.2.6

arXiv:2109.08807 [pdf]

The Report on China-Spain Joint Clinical Testing for Rapid COVID-19 Risk Screening by Eye-region Manifestations

Authors: Yanwei Fu, Feng Li, Paula Boned Fustel, Lei Zhao, Lijie Jia, Haojie Zheng, Qiang Sun, Shisong Rong, Haicheng Tang, Xiangyang Xue, Li Yang, Hong Li, Jiao Xie Wenxuan Wang, Yuan Li, Wei Wang, Yantao Pei, Jianmin Wang, Xiuqi Wu, Yanhua Zheng, Hongxia Tian, Mengwei Gu

Abstract: Background: The worldwide surge in coronavirus cases has led to the COVID-19 testing demand surge. Rapid, accurate, and cost-effective COVID-19 screening tests working at a population level are in imperative demand globally. Methods: Based on the eye symptoms of COVID-19, we developed and tested a COVID-19 rapid prescreening model using the eye-region images captured in China and Spain with cellphone cameras. The convolutional neural networks (CNNs)-based model was trained on these eye images to complete binary classification task of identifying the COVID-19 cases. The performance was measured using area under receiver-operating-characteristic curve (AUC), sensitivity, specificity, accuracy, and F1. The application programming interface was open access. Findings: The multicenter study included 2436 pictures corresponding to 657 subjects (155 COVID-19 infections, 23.6%) in development dataset (train and validation) and 2138 pictures corresponding to 478 subjects (64 COVID-19 infections, 13.4%) in test dataset. The image-level performance of COVID-19 prescreening model in the China-Spain multicenter study achieved an AUC of 0.913 (95% CI, 0.898-0.927), with a sensitivity of 0.695 (95% CI, 0.643-0.748), a specificity of 0.904 (95% CI, 0.891-0.919), an accuracy of 0.875 (0.861-0.889), and an F1 of 0.611 (0.568-0.655). Interpretation: The CNN-based model for COVID-19 rapid prescreening has reliable specificity and sensitivity. This system provides a low-cost, fully self-performed, non-invasive, real-time feedback solution for continuous surveillance and large-scale rapid prescreening for COVID-19. Funding: This project is supported by Aimomics (Shanghai) Intelligent

Submitted 17 September, 2021; originally announced September 2021.

arXiv:2102.02926 [pdf, other]

Alchemy: A benchmark and analysis toolkit for meta-reinforcement learning agents

Authors: Jane X. Wang, Michael King, Nicolas Porcel, Zeb Kurth-Nelson, Tina Zhu, Charlie Deck, Peter Choy, Mary Cassin, Malcolm Reynolds, Francis Song, Gavin Buttimore, David P. Reichert, Neil Rabinowitz, Loic Matthey, Demis Hassabis, Alexander Lerchner, Matthew Botvinick

Abstract: There has been rapidly growing interest in meta-learning as a method for increasing the flexibility and sample efficiency of reinforcement learning. One problem in this area of research, however, has been a scarcity of adequate benchmark tasks. In general, the structure underlying past benchmarks has either been too simple to be inherently interesting, or too ill-defined to support principled analysis. In the present work, we introduce a new benchmark for meta-RL research, emphasizing transparency and potential for in-depth analysis as well as structural richness. Alchemy is a 3D video game, implemented in Unity, which involves a latent causal structure that is resampled procedurally from episode to episode, affording structure learning, online inference, hypothesis testing and action sequencing based on abstract domain knowledge. We evaluate a pair of powerful RL agents on Alchemy and present an in-depth analysis of one of these agents. Results clearly indicate a frank and specific failure of meta-learning, providing validation for Alchemy as a challenging benchmark for meta-RL. Concurrent with this report, we are releasing Alchemy as a public resource, together with a suite of analysis tools and sample agent trajectories.

Submitted 20 October, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

Comments: Published in Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 2021

arXiv:2011.13464 [pdf, other]

Meta-learning in natural and artificial intelligence

Authors: Jane X. Wang

Abstract: Meta-learning, or learning to learn, has gained renewed interest in recent years within the artificial intelligence community. However, meta-learning is incredibly prevalent within nature, has deep roots in cognitive science and psychology, and is currently studied in various forms within neuroscience. The aim of this review is to recast previous lines of research in the study of biological intelligence within the lens of meta-learning, placing these works into a common framework. More recent points of interaction between AI and neuroscience will be discussed, as well as interesting new directions that arise under this perspective.

Submitted 26 November, 2020; originally announced November 2020.

arXiv:2010.02255 [pdf, other]

Temporal Difference Uncertainties as a Signal for Exploration

Authors: Sebastian Flennerhag, Jane X. Wang, Pablo Sprechmann, Francesco Visin, Alexandre Galashov, Steven Kapturowski, Diana L. Borsa, Nicolas Heess, Andre Barreto, Razvan Pascanu

Abstract: An effective approach to exploration in reinforcement learning is to rely on an agent's uncertainty over the optimal policy, which can yield near-optimal exploration strategies in tabular settings. However, in non-tabular settings that involve function approximators, obtaining accurate uncertainty estimates is almost as challenging a problem. In this paper, we highlight that value estimates are easily biased and temporally inconsistent. In light of this, we propose a novel method for estimating uncertainty over the value function that relies on inducing a distribution over temporal difference errors. This exploration signal controls for state-action transitions so as to isolate uncertainty in value that is due to uncertainty over the agent's parameters. Because our measure of uncertainty conditions on state-action transitions, we cannot act on this measure directly. Instead, we incorporate it as an intrinsic reward and treat exploration as a separate learning problem, induced by the agent's temporal difference uncertainties. We introduce a distinct exploration policy that learns to collect data with high estimated uncertainty, which gives rise to a curriculum that smoothly changes throughout learning and vanishes in the limit of perfect value estimates. We evaluate our method on hard exploration tasks, including Deep Sea and Atari 2600 environments and find that our proposed form of exploration facilitates both diverse and deep exploration.

Submitted 1 July, 2021; v1 submitted 5 October, 2020; originally announced October 2020.

Comments: 9 pages, 11 figures, 5 tables

arXiv:2008.09301 [pdf, other]

Amortized learning of neural causal representations

Authors: Nan Rosemary Ke, Jane X. Wang, Jovana Mitrovic, Martin Szummer, Danilo J. Rezende

Abstract: Causal models can compactly and efficiently encode the data-generating process under all interventions and hence may generalize better under changes in distribution. These models are often represented as Bayesian networks and learning them scales poorly with the number of variables. Moreover, these approaches cannot leverage previously learned knowledge to help with learning new causal models. In order to tackle these challenges, we present a novel algorithm called causal relational networks (CRN) for learning causal models using neural networks. The CRN represent causal models using continuous representations and hence could scale much better with the number of variables. These models also take in previously learned information to facilitate learning of new causal models. Finally, we propose a decoding-based metric to evaluate causal models with continuous representations. We test our method on synthetic data achieving high accuracy and quick adaptation to previously unseen causal models.

Submitted 21 August, 2020; originally announced August 2020.

Comments: ICLR 2020 causal learning for decision making workshop

arXiv:2007.03750 [pdf, other]

Deep Reinforcement Learning and its Neuroscientific Implications

Authors: Matthew Botvinick, Jane X. Wang, Will Dabney, Kevin J. Miller, Zeb Kurth-Nelson

Abstract: The emergence of powerful artificial intelligence is defining new research directions in neuroscience. To date, this research has focused largely on deep neural networks trained using supervised learning, in tasks such as image classification. However, there is another area of recent AI work which has so far received less attention from neuroscientists, but which may have profound neuroscientific implications: deep reinforcement learning. Deep RL offers a comprehensive framework for studying the interplay among learning, representation and decision-making, offering to the brain sciences a new set of research tools and a wide range of novel hypotheses. In the present review, we provide a high-level introduction to deep RL, discuss some of its initial applications to neuroscience, and survey its wider implications for research on brain and behavior, concluding with a list of opportunities for next-stage research.

Submitted 7 July, 2020; originally announced July 2020.

Comments: 22 pages, 5 figures

arXiv:1905.03030 [pdf, other]

Meta-learning of Sequential Strategies

Authors: Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alex Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin Miller, Mohammad Azar, Ian Osband, Neil Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew Botvinick, Shane Legg

Abstract: In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class. Our goal is to equip the reader with the conceptual foundations of this tool for building new, scalable agents that operate on broad domains. To do so, we present basic algorithmic templates for building near-optimal predictors and reinforcement learners which behave as if they had a probabilistic model that allowed them to efficiently exploit task structure. Furthermore, we recast memory-based meta-learning within a Bayesian framework, showing that the meta-learned strategies are near-optimal because they amortize Bayes-filtered data, where the adaptation is implemented in the memory dynamics as a state-machine of sufficient statistics. Essentially, memory-based meta-learning translates the hard problem of probabilistic sequential inference into a regression problem.

Submitted 18 July, 2019; v1 submitted 8 May, 2019; originally announced May 2019.

Comments: DeepMind Technical Report (15 pages, 6 figures). Version V1.1

arXiv:1811.05931 [pdf, other]

Evolving intrinsic motivations for altruistic behavior

Authors: Jane X. Wang, Edward Hughes, Chrisantha Fernando, Wojciech M. Czarnecki, Edgar A. Duenez-Guzman, Joel Z. Leibo

Abstract: Multi-agent cooperation is an important feature of the natural world. Many tasks involve individual incentives that are misaligned with the common good, yet a wide range of organisms from bacteria to insects and humans are able to overcome their differences and collaborate. Therefore, the emergence of cooperative behavior amongst self-interested individuals is an important question for the fields of multi-agent reinforcement learning (MARL) and evolutionary theory. Here, we study a particular class of multi-agent problems called intertemporal social dilemmas (ISDs), where the conflict between the individual and the group is particularly sharp. By combining MARL with appropriately structured natural selection, we demonstrate that individual inductive biases for cooperation can be learned in a model-free way. To achieve this, we introduce an innovative modular architecture for deep reinforcement learning agents which supports multi-level selection. We present results in two challenging environments, and interpret these in the context of cultural and ecological evolution.

Submitted 11 March, 2019; v1 submitted 14 November, 2018; originally announced November 2018.

Comments: 10 pages, 6 figures. In Proc. of the 18th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2019)

arXiv:1810.09465 [pdf, other]

doi 10.3847/1538-4357/aaea60

Variability-selected low-luminosity active galactic nuclei candidates in the 7 Ms Chandra Deep Field-South

Authors: N. Ding, B. Luo, W. N. Brandt, M. Paolillo, G. Yang, B. D. Lehmer, O. Shemmer, D. P. Schneider, P. Tozzi, Y. Q. Xue, X. C. Zheng, Q. S. Gu, A. M. Koekemoer, C. Vignali, F. Vito, J. X. Wang

Abstract: In deep X-ray surveys, active galactic nuclei (AGNs) with a broad range of luminosities have been identified. However, cosmologically distant low-luminosity AGN (LLAGN, $L_{\mathrm{X}} \lesssim 10^{42}$ erg s$^{-1}$) identification still poses a challenge due to significant contamination from host galaxies. Based on the 7 Ms Chandra Deep Field-South (CDF-S) survey, the longest timescale ($\sim 17$ years) deep X-ray survey to date, we utilize an X-ray variability selection technique to search for LLAGNs that remain unidentified among the CDF-S X-ray sources. We find 13 variable sources from 110 unclassified CDF-S X-ray sources. Except for one source which could be an ultraluminous X-ray source, the variability of the remaining 12 sources is most likely due to accreting supermassive black holes. These 12 AGN candidates have low intrinsic X-ray luminosities, with a median value of $7 \times 10^{40}$ erg s$^{-1}$. They are generally not heavily obscured, with an average effective power-law photon index of 1.8. The fraction of variable AGNs in the CDF-S is independent of X-ray luminosity and is only restricted by the total number of observed net counts, confirming previous findings that X-ray variability is a near-ubiquitous property of AGNs over a wide range of luminosities. There is an anti-correlation between X-ray luminosity and variability amplitude for high-luminosity AGNs, but as the luminosity drops to $\lesssim 10^{42}$ erg s$^{-1}$, the variability amplitude no longer appears dependent on the luminosity. The entire observed luminosity-variability trend can be roughly reproduced by an empirical AGN variability model based on a broken power-law power spectral density function.

Submitted 22 October, 2018; originally announced October 2018.

Comments: 18 pages, 11 figures, accepted for publication in ApJ

arXiv:1809.00319 [pdf, ps, other]

doi 10.3847/2041-8213/aaba17

A long decay of X-ray flux and spectral evolution in the supersoft active galactic nucleus GSN 069

Authors: X. W. Shu, S. S. Wang, L. M. Dou, N. Jiang, J. X. Wang, T. G. Wang

Abstract: GSN 069 is an optically identified very low-mass AGN which shows supersoft X-ray emission. The source is known to exhibit a huge X-ray outburst, with flux increased by more than a factor of ~240 compared to the quiescence state. We report its long-term evolution in the X-ray flux and spectral variations over a time-scale of about a decade, using both new and archival X-ray observations from XMM and Swift. The new Swift observations detected the source in its lowest level of X-ray activity since outburst, a factor of ~4 lower in the 0.2-2 keV flux than that obtained with the XMM observations nearly 8 years ago. Combining with the historical X-ray measurements, we find that the X-ray flux is decreasing slowly. There seemed to be spectral softening associated with the drop of X-ray flux. In addition, we find evidence for the presence of a weak, variable hard X-ray component, in addition to the dominant thermal blackbody emission reported before. The long decay of X-ray flux and spectral evolution, as well as the supersoft X-ray spectra, suggest that the source could be a tidal disruption event, though a highly variable AGN cannot be fully ruled out. Further continued X-ray monitoring would be required to test the TDE interpretation, through better determining the flux evolution in the decay phase.

Submitted 2 September, 2018; originally announced September 2018.

Comments: 7 pages, 4 figures, 1 table, published in the ApJ letters

Journal ref: 2018, ApJL, 857, L16

arXiv:1809.00318 [pdf, ps, other]

doi 10.1051/0004-6361/201833434

A unique distant submillimeter galaxy with an X-ray-obscured radio-luminous active galactic nucleus

Authors: X. W. Shu, Y. Q. Xue, D. Z. Liu, T. Wang, Y. K. Han, Y. Y. Chang, T. Liu, X. X. Huang, J. X. Wang, X. Z. Zheng, E. da Cunha, E. Daddi, D. Elbaz

Abstract: We present a multiwavelength study of an atypical submillimeter galaxy in the GOODS-North field, with the aim to understand its physical properties of stellar and dust emission, as well as the central AGN activity. Although it is shown that the source is likely an extremely dusty galaxy at high redshift, its exact position of submillimeter emission is unknown. With the new NOEMA interferometric imaging, we confirm that the source is a unique dusty galaxy. It has no obvious counterpart in the optical and even NIR images observed with HST at lambda~<1.4um. Photometric-redshift analyses from both stellar and dust SED suggest it to likely be at z~>4, though a lower redshift at z~>3.1 cannot be fully ruled out (at 90% confidence interval). Explaining its unusual optical-to-NIR properties requires an old stellar population (~0.67 Gyr), coexisting with a very dusty ongoing starburst component. The latter is contributing to the FIR emission, with its rest-frame UV and optical light being largely obscured along our line of sight. If the observed fluxes at the rest-frame optical/NIR wavelengths were mainly contributed by old stars, a total stellar mass of ~3.5x10^11Msun would be obtained. An X-ray spectral analysis suggests that this galaxy harbors a heavily obscured AGN with N_H=3.3x10^23 cm^-2 and an intrinsic 2-10 keV luminosity of L_X~2.6x10^44 erg/s, which places this object among distant type 2 quasars. The radio emission of the source is extremely bright, which is an order of magnitude higher than the star-formation-powered emission, making it one of the most distant radio-luminous dusty galaxies. The combined characteristics of the galaxy suggest that the source appears to have been caught in a rare but critical transition stage in the evolution of submillimeter galaxies, where we are witnessing the birth of a young AGN and possibly the earliest stage of its jet formation and feedback.

Submitted 2 September, 2018; originally announced September 2018.

Comments: 13 pages in printer format, 10 figures, 1 table, accepted for publication in the A&A

Journal ref: A&A 619, A76 (2018)

arXiv:1805.09692 [pdf, other]

Been There, Done That: Meta-Learning with Episodic Recall

Authors: Samuel Ritter, Jane X. Wang, Zeb Kurth-Nelson, Siddhant M. Jayakumar, Charles Blundell, Razvan Pascanu, Matthew Botvinick

Abstract: Meta-learning agents excel at rapidly learning new tasks from open-ended task distributions; yet, they forget what they learn about each task as soon as the next begins. When tasks reoccur - as they do in natural environments - meta-learning agents must explore again instead of immediately exploiting previously discovered solutions. We propose a formalism for generating open-ended yet repetitious environments, then develop a meta-learning architecture for solving these environments. This architecture melds the standard LSTM working memory with a differentiable neural episodic memory. We explore the capabilities of agents with this episodic LSTM in five meta-learning environments with reoccurring tasks, ranging from bandits to navigation and stochastic sequential decision problems.

Submitted 6 July, 2018; v1 submitted 24 May, 2018; originally announced May 2018.

Comments: ICML 2018

arXiv:1710.04358 [pdf, other]

doi 10.3847/1538-4357/aa9378

Deepest view of AGN X-ray variability with the 7 Ms Chandra Deep Field-South survey

Authors: X. C. Zheng, Y. Q. Xue, W. N. Brandt, J. Y. Li, M. Paolillo, G. Yang, S. F. Zhu, B. Luo, M. Y. Sun, T. M. Hughes, F. E. Bauer, F. Vito, J. X. Wang, T. Liu, C. Vignali, X. W. Shu

Abstract: We systematically analyze X-ray variability of active galactic nuclei (AGNs) in the 7 Ms Chandra Deep Field-South survey. On the longest timescale ($\approx 17$ years), we find only weak (if any) dependence of X-ray variability amplitudes on energy bands or obscuration. We use four different power spectral density (PSD) models to fit the anti-correlation between normalized excess variance ($σ^2_{\rm nxv}$) and luminosity, and obtain a best-fit power law index $β=1.16^{+0.05}_{-0.05}$ for the low-frequency part of AGN PSD. We also divide the whole light curves into 4 epochs in order to inspect the dependence of $σ^2_{\rm nxv}$ on these timescales, finding an overall increasing trend. The analysis of these shorter light curves also infers a $β$ of $\sim 1.3$ that is consistent with the above-derived $β$, which is larger than the frequently-assumed value of $β=1$. We then investigate the evolution of $σ^2_{\rm nxv}$. No definitive conclusion is reached due to limited source statistics but, if present, the observed trend goes in the direction of decreasing AGN variability at fixed luminosity toward large redshifts. We also search for transient events and find 6 notable candidate events with our considered criteria. Two of them may be a new type of fast transient events, one of which is reported here for the first time. We therefore estimate a rate of fast outbursts $\langle\dot{N}\rangle = 1.0^{+1.1}_{-0.7}\times 10^{-3}~\rm galaxy^{-1}~yr^{-1}$ and a tidal disruption event (TDE) rate $\langle\dot{N}_{\rm TDE}\rangle=8.6^{+8.5}_{-4.9}\times 10^{-5}~\rm galaxy^{-1}~yr^{-1}$ assuming the other four long outbursts to be TDEs.

Submitted 12 October, 2017; originally announced October 2017.

Comments: 20 pages, 16 figures. Accepted for publication in ApJ

arXiv:1707.05332 [pdf, ps, other]

doi 10.1093/mnras/stx1761

Tracing the accretion history of supermassive Black Holes through X-ray variability: results from the Chandra Deep Field-South

Authors: M. Paolillo, I. Papadakis, W. N. Brandt, B. Luo, Y. Q. Xue, P. Tozzi, O. Shemmer, V. Allevato, F. E. Bauer, A. Comastri, R. Gilli, A. Koekemoer, T. Liu, C. Vignali, F. Vito, G. Yang, J. X. Wang, X. C. Zheng

Abstract: We study the X-ray variability properties of distant AGNs in the Chandra Deep Field-South region over 17 years, up to $z\sim 4$, and compare them with those predicted by models based on local samples. We use the results of Monte Carlo simulations to account for the biases introduced by the discontinuous sampling and the low-count regime. We confirm that variability is a ubiquitous property of AGNs, with no clear dependence on the density of the environment. The variability properties of high-z AGNs, over different temporal timescales, are most consistent with a Power Spectral Density (PSD) described by a broken (or bending) power-law, similar to nearby AGNs. We confirm the presence of an anti-correlation between luminosity and variability, resulting from the dependence of variability on BH mass and accretion rate. We explore different models, finding that our acceptable solutions predict that BH mass influences the value of the PSD break frequency, while the Eddington ratio $λ_{Edd}$ affects the PSD break frequency and, possibly, the PSD amplitude as well. We derive the evolution of the average $λ_{Edd}$ as a function of redshift, finding results in agreement with measurements based on different estimators. The large statistical uncertainties make our results consistent with a constant Eddington ratio, although one of our models suggests a possible increase of $λ_{Edd}$ with lookback time up to $z\sim 2-3$. We conclude that variability is a viable means to trace the accretion history of supermassive BHs, whose usefulness will increase with future, wide-field/large effective area X-ray missions.

Submitted 1 August, 2017; v1 submitted 17 July, 2017; originally announced July 2017.

Comments: 15 pages, 9 Figures, 2 Tables, in press on MNRAS

arXiv:1701.08484 [pdf, other]

doi 10.3847/1538-4357/aa5d55

Young Galaxy Candidates in the Hubble Frontier Fields IV. MACS J1149.5+2223

Authors: W. Zheng, A. Zitrin, L. Infante, N. Laporte, X. X. Huang, J. Moustakas, H. C. Ford, X. W. Shu, J. X. Wang, J. M. Diego, F. E. Bauer, P. T. Iribarren, T. Broadhurst, A. Molino

Abstract: We search for high-redshift dropout galaxies behind the Hubble Frontier Fields (HFF) galaxy cluster MACS J1149.5+2223, a powerful cosmic lens that has revealed a number of unique objects in its field. Using the deep images from the Hubble and Spitzer space telescopes, we find 11 galaxies at z>7 in the MACS J1149.5+2223 cluster field, and 11 in its parallel field. The high-redshift nature of the bright z~9.6 galaxy MACS1149-JD, previously reported by Zheng et al., is further supported by non-detection in the extremely deep optical images from the HFF campaign. With the new photometry, the best photometric redshift solution for MACS1149-JD reduces slightly to z=9.44 +/- 0.12. The young galaxy has an estimated stellar mass of (7 +/- 2)X10E8 Msun, and was formed at z=13.2 +1.9-1.6 when the universe was ~300 Myr old. Data available for the first four HFF clusters have already enabled us to find faint galaxies to an intrinsic magnitude of M(UV) ~ -15.5, approximately a factor of ten deeper than the parallel fields.

Submitted 14 February, 2017; v1 submitted 30 January, 2017; originally announced January 2017.

Comments: To appear in the Astrophysical Journal. 20 pages, 6 figures, 3 tables. arXiv admin note: text overlap with arXiv:1402.6743

arXiv:1611.05763 [pdf, other]

Learning to reinforcement learn

Authors: Jane X Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, Matt Botvinick

Abstract: In recent years deep reinforcement learning (RL) systems have attained superhuman performance in a number of challenging task domains. However, a major limitation of such applications is their demand for massive amounts of training data. A critical present objective is thus to develop deep RL methods that can adapt rapidly to new tasks. In the present work we introduce a novel approach to this challenge, which we refer to as deep meta-reinforcement learning. Previous work has shown that recurrent networks can support meta-learning in a fully supervised context. We extend this approach to the RL setting. What emerges is a system that is trained using one RL algorithm, but whose recurrent dynamics implement a second, quite separate RL procedure. This second, learned RL algorithm can differ from the original one in arbitrary ways. Importantly, because it is learned, it is configured to exploit structure in the training domain. We unpack these points in a series of seven proof-of-concept experiments, each of which examines a key aspect of deep meta-RL. We consider prospects for extending and scaling up the approach, and also point out some potentially important implications for neuroscience.

Submitted 23 January, 2017; v1 submitted 17 November, 2016; originally announced November 2016.

Comments: 17 pages, 7 figures, 1 table

arXiv:1607.08823 [pdf]

doi 10.1117/12.2232034

eXTP -- enhanced X-ray Timing and Polarimetry Mission

Authors: S. N. Zhang, M. Feroci, A. Santangelo, Y. W. Dong, H. Feng, F. J. Lu, K. Nandra, Z. S. Wang, S. Zhang, E. Bozzo, S. Brandt, A. De Rosa, L. J. Gou, M. Hernanz, M. van der Klis, X. D. Li, Y. Liu, P. Orleanski, G. Pareschi, M. Pohl, J. Poutanen, J. L. Qu, S. Schanne, L. Stella, P. Uttley , et al. (160 additional authors not shown)

Abstract: eXTP is a science mission designed to study the state of matter under extreme conditions of density, gravity and magnetism. Primary targets include isolated and binary neutron stars, strong magnetic field systems like magnetars, and stellar-mass and supermassive black holes. The mission carries a unique and unprecedented suite of state-of-the-art scientific instruments enabling for the first time ever the simultaneous spectral-timing-polarimetry studies of cosmic sources in the energy range from 0.5-30 keV (and beyond). Key elements of the payload are: the Spectroscopic Focusing Array (SFA) - a set of 11 X-ray optics for a total effective area of about 0.9 m^2 and 0.6 m^2 at 2 keV and 6 keV respectively, equipped with Silicon Drift Detectors offering <180 eV spectral resolution; the Large Area Detector (LAD) - a deployable set of 640 Silicon Drift Detectors, for a total effective area of about 3.4 m^2, between 6 and 10 keV, and spectral resolution <250 eV; the Polarimetry Focusing Array (PFA) - a set of 2 X-ray telescopes, for a total effective area of 250 cm^2 at 2 keV, equipped with imaging gas pixel photoelectric polarimeters; the Wide Field Monitor (WFM) - a set of 3 coded mask wide field units, equipped with position-sensitive Silicon Drift Detectors, each covering a 90 degrees x 90 degrees FoV. The eXTP international consortium includes mostly major institutions of the Chinese Academy of Sciences and Universities in China, as well as major institutions in several European countries and the United States. The predecessor of eXTP, the XTP mission concept, has been selected and funded as one of the so-called background missions in the Strategic Priority Space Science Program of the Chinese Academy of Sciences since 2011. The strong European participation has significantly enhanced the scientific capabilities of eXTP. The planned launch date of the mission is earlier than 2025.

Submitted 29 July, 2016; originally announced July 2016.

Comments: 16 pages, 16 figures. Oral talk presented at SPIE Astronomical Telescopes and Instrumentation, June 26 to July 1, 2016, Edinburgh, UK

Journal ref: Science China Physics, Mechanics & Astronomy, 2019, Volume 62, Issue 2, article id. 29502, 25 pp

arXiv:1605.04278 [pdf, ps, other]

Universal Dependencies for Learner English

Authors: Yevgeni Berzak, Jessica Kenney, Carolyn Spadine, Jing Xian Wang, Lucia Lam, Keiko Sophie Mori, Sebastian Garza, Boris Katz

Abstract: We introduce the Treebank of Learner English (TLE), the first publicly available syntactic treebank for English as a Second Language (ESL). The TLE provides manually annotated POS tags and Universal Dependency (UD) trees for 5,124 sentences from the Cambridge First Certificate in English (FCE) corpus. The UD annotations are tied to a pre-existing error annotation of the FCE, whereby full syntactic analyses are provided for both the original and error corrected versions of each sentence. Further on, we delineate ESL annotation guidelines that allow for consistent syntactic treatment of ungrammatical English. Finally, we benchmark POS tagging and dependency parsing performance on the TLE dataset and measure the effect of grammatical errors on parsing accuracy. We envision the treebank to support a wide range of linguistic and computational research on second language acquisition as well as automatic processing of ungrammatical language. The treebank is available at universaldependencies.org. The annotation manual used in this project and a graphical query engine are available at esltreebank.org.

Submitted 7 June, 2016; v1 submitted 13 May, 2016; originally announced May 2016.

Comments: Updated parsing experiments to EWT v1.3, improved grammatical error marking, minor revisions. To appear in ACL 2016

arXiv:1512.00167 [pdf, ps, other]

doi 10.3847/0067-0049/222/1/4

Identification of z~>2 Herschel 500 micron sources using color-deconfusion

Authors: X. W. Shu, D. Elbaz, N. Bourne, C. Schreiber, T. Wang, J. S. Dunlop, A. Fontana, R. Leiton, M. Pannella, K. Okumura, M. J. Michalowski, P. Santini, E. Merlin, F. Buitrago, V. A. Bruce, R. Amorin, M. Castellano, S. Derriere, A. Comastri, N. Cappelluti, J. X. Wang, H. C. Ferguson

Abstract: We present a new method to search for candidate z~>2 Herschel 500μm sources in the GOODS-North field, using a S500μm/S24μm "color deconfusion" technique. Potential high-z sources are selected against low-redshift ones from their large 500μm to 24μm flux density ratios. By effectively reducing the contribution from low-redshift populations to the observed 500μm emission, we are able to identify counterparts to high-z 500μm sources whose 24μm fluxes are relatively faint. The recovery of known z~4 starbursts confirms the efficiency of this approach in selecting high-z Herschel sources. The resulting sample consists of 34 dusty star-forming galaxies at z~>2. The inferred infrared luminosities are in the range 1.5x10^12-1.8x10^13 Lsun, corresponding to dust-obscured star formation rates (SFRs) of ~260-3100 Msun/yr for a Salpeter IMF. Comparison with previous SCUBA 850μm-selected galaxy samples shows that our method is more efficient at selecting high-z dusty galaxies with a median redshift of z=3.07+/-0.83 and 10 of the sources at z~>4. We find that at a fixed luminosity, the dust temperature is ~5K cooler than that expected from the Td-LIR relation at z<1, though different temperature selection effects should be taken into account. The radio-detected subsample (excluding three strong AGN) follows the far-infrared/radio correlation at lower redshifts, and no evolution with redshift is observed out to z~5, suggesting that the far-infrared emission is star formation dominated.
The contribution of the high-z Herschel 500μm sources to the cosmic SFR density is comparable to that of SMG populations at z~2.5 and at least 40% of the extinction-corrected UV samples at z~4 (abridged).

Submitted 1 December, 2015; originally announced December 2015.

Comments: 33 pages in emulateapj format, 24 figures, 2 tables, accepted for publication in the ApJS

arXiv:1405.3373 [pdf, ps, other]

doi 10.1016/j.physletb.2014.08.033

Breakdown of QCD Factorization for P-Wave Quarkonium Production at Low Transverse Momentum

Authors: J. P. Ma, J. X. Wang, S. Zhao

Abstract: Quarkonium production at low transverse momentum in hadron collisions can be used to extract Transverse-Momentum-Dependent (TMD) gluon distribution functions, if TMD factorization holds there. We show that TMD factorization for the case of P-wave quarkonium with $J^{PC}=0^{++}, 2^{++}$ holds at one-loop level, but is violated beyond one-loop level. TMD factorization for other P-wave quarkonium is also violated already at one-loop.

Submitted 27 August, 2014; v1 submitted 14 May, 2014; originally announced May 2014.

Comments: Published version in Physics Letters B (2014), pp. 103-108

arXiv:1403.5136 [pdf]

doi 10.1063/1.4891979

Reversing ferroelectric polarization in multiferroic DyMn2O5 by nonmagnetic Al substitution of Mn

Authors: Z. Y. Zhao, M. F. Liu, X. Li, J. X. Wang, Z. B. Yan, K. F. Wang, J. -M. Liu

Abstract: The multiferroic RMn2O5 family, where R is a rare-earth ion or Y, exhibits rich physics of multiferroicity which is not yet well understood, noting that multiferroicity is receiving attention for its promising application potential. DyMn2O5 is a representative member of this family. The ferroelectric polarization in DyMn2O5 is claimed to have two anti-parallel components: one (PDM) from the symmetric exchange striction between the Dy3+-Mn4+ interactions and the other (PMM) from the symmetric exchange striction between the Mn3+-Mn4+ interactions. We investigate the evolutions of the two components upon a partial substitution of Mn3+ by nonmagnetic Al3+ in order to tailor the Mn-Mn interactions and then to modulate component PMM in DyMn2-x/2Alx/2O5. It is revealed that the ferroelectric polarization can be successfully reversed by the Al-substitution via substantially suppressing the Mn3+-Mn4+ interactions and thus the PMM. The Dy3+-Mn4+ interactions and the polarization component PDM can sustain against the substitution until a level as high as x=0.2. In addition, the independent Dy spin ordering is shifted remarkably down to an extremely low temperature due to the Al3+ substitution. The present work not only confirms the existence of the two anti-parallel polarization components but also unveils the possibility of tailoring them independently.

Submitted 20 March, 2014; originally announced March 2014.

Comments: 30 pages, 11 figures

arXiv:1402.0422 [pdf, other]

A high-reproducibility and high-accuracy method for automated topic classification

Authors: Andrea Lancichinetti, M. Irmak Sirer, Jane X. Wang, Daniel Acuna, Konrad Körding, Luís A. Nunes Amaral

Abstract: Much of human knowledge sits in large databases of unstructured text. Leveraging this knowledge requires algorithms that extract and record metadata on unstructured text documents. Assigning topics to documents will enable intelligent search, statistical characterization, and meaningful classification. Latent Dirichlet allocation (LDA) is the state-of-the-art in topic classification. Here, we perform a systematic theoretical and numerical analysis that demonstrates that current optimization techniques for LDA often yield results which are not accurate in inferring the most suitable model parameters. Adapting approaches for community detection in networks, we propose a new algorithm which displays high-reproducibility and high-accuracy, and also has high computational efficiency. We apply it to a large set of documents in the English Wikipedia and reveal its hierarchical structure. Our algorithm promises to make "big data" text analysis systems more reliable.

Submitted 3 February, 2014; originally announced February 2014.

Comments: 23 pages, 24 figures

arXiv:1312.5816 [pdf, ps, other]

doi 10.1002/2013JA019291

Variation of the solar magnetic flux spectrum during solar cycle 23

Authors: C. L. Jin, J. X. Wang

Abstract: By using the unique database of SOHO/MDI full disk magnetograms from 1996 September to 2011 January, covering the entire solar cycle 23, we analyze the time-variability of the solar magnetic flux spectrum and study the properties of extended minimum of cycle 23. We totally identify 11.5 million magnetic structures. It has been revealed that magnetic features with different magnetic fluxes exhibit different cycle behaviors. The magnetic features with flux larger than $4.0 \times 10^{19}$ Mx, which cover solar active regions and strong network features, show exactly the same variation as sunspots; however, the remaining $82\%$ magnetic features which cover the majority of network elements show anti-phase variation with sunspots. We select a criterion that the monthly sunspot number is less than 20 to represent the Sun's low activity status. Then we find the extended minimum of cycle 23 is characterized by the long duration of low activity status, but the magnitude of magnetic flux in this period is not lower than previous cycle. Both the duration of low activity status and the minimum activity level defined by minimum sunspot number show a century period approximately. The extended minimum of cycle 23 shows similarities with solar cycle 11, which preceded the mini-maxima in later solar cycles. This similarity is suggestive that the solar cycles following cycle 23 are likely to have low activity.

Submitted 20 December, 2013; originally announced December 2013.

Comments: 24 pages, 7 figures, accepted by JGR in 2013

arXiv:1211.7144 [pdf, ps, other]

doi 10.1103/PhysRevD.88.014027

Transverse Momentum Dependent Factorization for Quarkonium Production at Low Transverse Momentum

Authors: J. P. Ma, J. X. Wang, S. Zhao

Abstract: Quarkonium production in hadron collisions at low transverse momentum $q_\perp \ll M$ with $M$ as the quarkonium mass can be used for probing transverse momentum dependent (TMD) gluon distributions. For this purpose, one needs to establish the TMD factorization for the process. We examine the factorization at the one-loop level for the production of $η_c$ or $η_b$. The perturbative coefficient in the factorization is determined at one-loop accuracy. Comparing the factorization derived at tree level and that beyond the tree level, a soft factor is, in general, needed to completely cancel soft divergences. We have also discussed possible complications of TMD factorization of p-wave quarkonium production.

Submitted 12 August, 2013; v1 submitted 29 November, 2012; originally announced November 2012.

Comments: Title changed in the journal, published version

arXiv:1205.6533 [pdf, ps, other]

doi 10.1051/0004-6361/201118037

Quantifying solar superactive regions with vector magnetic field observations

Authors: A. Q. Chen, J. X. Wang

Abstract: The vector magnetic field characteristics of superactive regions (SARs) hold the key for understanding why SARs are extremely active and provide the guidance in space weather prediction. We aim to quantify the characteristics of SARs using the vector magnetograms taken by the Solar Magnetic Field Telescope at Huairou Solar Observatory Station. The vector magnetic field characteristics of 14 SARs in solar cycles 22 and 23 were analyzed using the following four parameters: 1) the magnetic flux imbalance between opposite polarities, 2) the total photospheric free magnetic energy, 3) the length of the magnetic neutral line with its steep horizontal magnetic gradient, and 4) the area with strong magnetic shear. Furthermore, we selected another eight large and inactive active regions (ARs), which are called fallow ARs (FARs), to compare them with the SARs. We found that most of the SARs have a net magnetic flux higher than $7.0\times10^{21}$ Mx, a total photospheric free magnetic energy higher than $1.0\times10^{24}$ erg/cm, a magnetic neutral line with a steep horizontal magnetic gradient ($\geq 300$ G/Mm) longer than 30 Mm, and an area with strong magnetic shear (shear angle $\geq 80^\circ$) greater than 100 Mm$^2$. In contrast, the values of these parameters for the FARs are mostly very low. The Pearson $\chi^2$ test was used to examine the significance of the difference between the SARs and FARs, and the results indicate that these two types of ARs can be fairly distinguished by each of these parameters.
The significance levels are 99.55%, 99.98%, 99.98%, and 99.96%, respectively. However, no single parameter can distinguish them perfectly. Therefore we propose a composite index based on these parameters, and find that the distinction between the two types of ARs is also significant with a significance level of 99.96%. These results are useful for a better physical understanding of the SAR and FAR

Submitted 29 May, 2012; originally announced May 2012.

Comments: 9 pages, 3 figures, 2 tables

arXiv:1202.4518 [pdf, ps, other]

doi 10.1088/0004-637X/750/1/16

Revision of Solar Spicule Classification

Authors: Y. Z. Zhang, K. Shibata, J. X. Wang, X. J. Mao, T. Matsumoto, Y. Liu, J. T. Su

Abstract: Solar spicules are the fundamental magnetic structures in the chromosphere and considered to play a key role in channelling the chromosphere and corona. Recently, it was suggested by De Pontieu et al. that there were two types of spicules with very different dynamic properties, which were detected by space-time plot technique in the Ca II H line (3968 Å) wavelength from Hinode/SOT observations. 'Type I' spicule, with a 3-7 minute lifetime, undergoes a cycle of upward and downward motion; in contrast, 'Type II' spicule fades away within dozens of seconds, without descending phase. We are motivated by the fact that for a spicule with complicated 3D motion, the space-time plot, which is made through a slit on a fixed position, could not match the spicule behavior all the time and might lose its real life story. By revisiting the same data sets, we identify and trace 105 and 102 spicules in quiet sun (QS) and coronal hole (CH), respectively, and obtain their statistical dynamic properties. First, we have not found a single convincing example of 'Type II' spicules. Secondly, more than 60% of the identified spicules in each region show a complete cycle, i.e., the majority of spicules are 'Type I'. Thirdly, the lifetimes of spicules in QS and CH are 148 s and 112 s, respectively, but there is no fundamental lifetime difference between the spicules in QS and CH reported earlier. Therefore, the suggestion of coronal heating by 'Type II' spicules should be taken with caution. Subject headings: Sun: chromosphere Sun: transition region Sun: corona

Submitted 20 February, 2012; originally announced February 2012.

Comments: accepted by ApJ

Journal ref: The Astrophysical Journal, 750:16 (9pp), 2012 May 1

arXiv:1105.4545 [pdf, other]

doi 10.1016/j.nuclphysbps.2011.03.053

Quarkonium production in high energy proton-proton and proton-nucleus collisions

Authors: Z. Conesa del Valle, G. Corcella, F. Fleuret, E. G. Ferreiro, V. Kartvelishvili, B. Z. Kopeliovich, J. P. Lansberg, C. Lourenço, G. Martinez, V. Papadimitriou, H. Satz, E. Scomparin, T. Ullrich, O. Teryaev, R. Vogt, J. X. Wang

Abstract: We present a brief overview of the most relevant current issues related to quarkonium production in high energy proton-proton and proton-nucleus collisions along with some perspectives. After reviewing recent experimental and theoretical results on quarkonium production in pp and pA collisions, we discuss the emerging field of polarisation studies. Thereafter, we report on issues related to heavy-quark production, both in pp and pA collisions, complemented by AA collisions. To put the work in a broader perspective, we emphasize the need for new observables to investigate quarkonium production mechanisms and reiterate the qualities that make quarkonia a unique tool for many investigations in particle and nuclear physics.

Submitted 23 May, 2011; originally announced May 2011.

Comments: Overview for the proceedings of QUARKONIUM 2010: Three Days Of Quarkonium Production in pp and pA Collisions, 29-31 July 2010, Palaiseau, France; 34 pages, 30 figures, Latex

Report number: USM-TH-285

Journal ref: Nuclear Physics B (Proceedings Supplements) 214 (2011) pp. 3-36

arXiv:1102.3728 [pdf, ps, other]

doi 10.1088/0004-637X/731/1/37

The Sun's small-scale magnetic elements in Solar Cycle 23

Authors: C. L. Jin, J. X. Wang, Q. Song, H. Zhao

Abstract: With the unique database from Michelson Doppler Imager aboard the Solar and Heliospheric Observatory in an interval embodying solar cycle 23, the cyclic behavior of solar small-scale magnetic elements is studied. More than 13 million small-scale magnetic elements are selected, and the following results are disclosed. (1) The quiet regions dominated the Sun's magnetic flux for about 8 years in the 12.25 year duration of Cycle 23. They contributed (0.94 - 1.44) $\times 10^{23}$ Mx flux to the Sun from the solar minimum to maximum. The monthly average magnetic flux of the quiet regions is 1.12 times that of active regions in the cycle. (2) The ratio of quiet region flux to that of the total Sun equally characterizes the course of a solar cycle. The 6-month running-average flux ratio of quiet region had been larger than 90.0% for 28 continuous months from July 2007 to October 2009, which characterizes very well the grand solar minima of Cycles 23-24. (3) From the small to large end of the flux spectrum, the variations of numbers and total flux of the network elements show no-correlation, anti-correlation, and correlation with sunspots, respectively. The anti-correlated elements, covering the flux of (2.9 - 32.0)$\times 10^{18}$ Mx, occupy 77.2% of total element number and 37.4% of quiet Sun flux. These results provide insight into the reason for anti-correlated variations of small-scale magnetic activity during the solar cycle.

Submitted 17 February, 2011; originally announced February 2011.

Comments: 21 pages, 6 figures, Accepted by ApJ

arXiv:1102.3485 [pdf, ps, other]

Small-scale magnetic elements in Solar Cycle 23

Authors: C. L. Jin, J. X. Wang, Q. Song, H. Zhao

Abstract: With the unique database from Michelson Doppler Imager aboard the Solar and Heliospheric Observatory in an interval embodying solar cycle 23, the cyclic behavior of solar small-scale magnetic elements is studied. More than 13 million small-scale magnetic elements are selected, and the following results are disclosed. (1) The quiet regions dominated the Sun's magnetic flux for about 8 years in the 12.25 year duration of Cycle 23. They contributed (0.94 -- 1.44) $\times 10^{23}$ Mx flux to the Sun from the solar minimum to maximum. The monthly average magnetic flux of the quiet regions is 1.12 times that of active regions in the cycle. (2) The ratio of quiet region flux to that of the total Sun equally characterizes the course of a solar cycle. The 6-month running-average flux ratio of quiet region had been larger than 90.0% for 28 continuous months from July 2007 to October 2009, which characterizes very well the grand solar minima of Cycles 23-24. (3) From the small to large end of the flux spectrum, the variations of numbers and total flux of the network elements show no-correlation, anti-correlation, and correlation with sunspots, respectively. The anti-correlated elements, covering the flux of (2.9 - 32.0)$\times 10^{18}$ Mx, occupy 77.2% of total element number and 37.4% of quiet Sun flux. These results provide insight into the reason for anti-correlated variations of small-scale magnetic activity during the solar cycle.

Submitted 16 February, 2011; originally announced February 2011.

Comments: 21 pages, 6 figures, accepted by ApJ

arXiv:1008.1502 [pdf, ps, other]

doi 10.1088/0004-637X/722/1/96

XMM Observations of the Seyfert 2 Galaxy NGC 7590: the Nature of X-ray Absorption

Authors: X. W. Shu, T. Liu, J. X. Wang

Abstract: We present the analysis of three XMM observations of the Seyfert 2 galaxy NGC 7590. The source was found to have no X-ray absorption in the low spatial resolution ASCA data. The XMM observations provide a factor of 10 better spatial resolution than previous ASCA data. We find that the X-ray emission of NGC 7590 is dominated by an off-nuclear ULX and extended emission from the host galaxy. The nuclear X-ray emission is rather weak compared with the host galaxy. Based on its very low X-ray luminosity as well as the small ratio between the 2-10 keV and the [O III] fluxes, we interpret NGC 7590 as Compton-thick rather than being an "unobscured" Seyfert 2 galaxy. Future higher resolution observations such as Chandra are crucial to shed light on the nature of NGC 7590 nucleus.

Submitted 20 September, 2010; v1 submitted 9 August, 2010; originally announced August 2010.

Comments: The Astrophysical Journal, Accepted (15 pages, 4 figures)

Journal ref: 2010, ApJ, 722, 96

arXiv:1005.3829 [pdf, ps, other]

doi 10.1088/0004-637X/718/1/52

X-ray properties of the z ~ 4.5 Lyman-alpha Emitters in the Chandra Deep Field South Region

Authors: Z. Y. Zheng, J. X. Wang, S. L. Finkelstein, S. Malhotra, J. E. Rhoads, K. D. Finkelstein

Abstract: We report the first X-ray detection of 113 Lyman-alpha emitters at redshift z ~ 4.5. Only one source (J033127.2-274247) is detected in the Extended Chandra Deep Field South (ECDF-S) X-ray data, and has been spectroscopically confirmed as a z = 4.48 quasar with $L_X = 4.2\times 10^{44}$ erg/s. The single detection gives a Lyman-alpha quasar density consistent with the X-ray luminosity function of quasars. The coadded counts of 22 Lyman-alpha emitters (LAEs) in the central Chandra Deep Field South (CDF-S) region yields a S/N=2.4 (p=99.83%) detection at soft band, with an effective exposure time of ~36 Ms. Further analysis of the equivalent width (EW) distribution shows that all the signal comes from 12 LAE candidates with EW_rest < 400 Å, and 2 of them contribute about half of the signal. Following-up spectroscopic observations show that the two are a low-redshift emission line galaxy and a Lyman break galaxy at z = 4.4. Excluding these two and combined with ECDF-S data, we derive a 3-sigma upper limit on the average luminosity of $L_{0.5-2 keV}$ $<$ 2.4 $\times 10^{42}$ ergs/s for z ~ 4.5 LAEs. If the average X-ray emission is due to star formation, it corresponds to a star-formation rate (SFR) of < 180--530 M$_\odot$ per yr. We use this SFR_X as an upper limit of the unobscured SFR to constrain the escape fraction of Lyman-alpha photons, and find a lower limit of f_esc > 3-10%.
However, our upper limit on the SFR_X is ~7 times larger than the upper limit on SFR_X on z ~ 3.1 LAEs in the same field, and at least 30 times higher than the SFR estimated from Lyman-alpha emission. From the average X-ray to Lyman-alpha line ratio, we estimate that fewer than 3.2% (6.3%) of our LAEs could be high redshift type 1 (type 2) AGNs, and those hidden AGNs likely show low rest frame EWs.

Submitted 25 May, 2010; v1 submitted 20 May, 2010; originally announced May 2010.

Comments: 12 pages, 5 figures, ApJ accepted

arXiv:1003.1790 [pdf, ps, other]

doi 10.1088/0067-0049/187/2/581

The Cores of the Fe K$α$ Lines in Active Galactic Nuclei: an Extended Chandra High Energy Grating Sample

Authors: X. W. Shu, T. Yaqoob, J. X. Wang

Abstract: We extend the study of the core of the Fe K$α$ emission line at $\sim 6.4$ keV in Seyfert galaxies reported in Yaqoob & Padmanabhan (2004) using a larger sample observed by the Chandra High Energy Grating (HEG). Whilst heavily obscured active galactic nuclei (AGNs) are excluded from the sample, these data offer some of the highest precision measurements of the peak energy of the Fe K$α$ line, and the highest spectral resolution measurements of the width of the core of the line in unobscured and moderately obscured ($N_{H}<10^{23} \ \rm cm^{-2}$) Seyfert galaxies to date. The Fe K$α$ line is detected in 33 sources, and its centroid energy is constrained in 32 sources. In 27 sources the statistical quality of the data is good enough to yield measurements of the FWHM. We find that the distribution in the line centroid energy is strongly peaked around the value for neutral Fe, with over 80% of the observations giving values in the range 6.38--6.43 keV. Including statistical errors, 30 out of 32 sources ($\sim 94\%$) have a line centroid energy in the range 6.35--6.47 keV. The mean equivalent width, amongst the observations in which a non-zero lower limit could be measured, was $53 \pm 3$ eV. The mean FWHM from the subsample of 27 sources was $2060 \pm 230 \ \rm km \ s^{-1}$. The mean EW and FWHM are somewhat higher when multiple observations for a given source are averaged.
From a comparison with the H$β$ optical emission-line widths (or, for one source, Br$α$), we find that there is no universal location of the Fe K$α$ line-emitting region relative to the optical BLR. We confirm the presence of the X-ray Baldwin effect, an anti-correlation between the Fe K$α$ line EW and X-ray continuum luminosity. The HEG data have enabled isolation of this effect to the narrow core of the Fe K$α$ line.

Submitted 9 March, 2010; originally announced March 2010.

Comments: 54 pages, 7 figures, and 4 tables, to appear in ApJ Supplement Series

arXiv:1003.1789 [pdf, ps, other]

doi 10.1088/0004-637X/713/2/1256

NGC 2992 in an X-ray high state observed by XMM: Response of the Relativistic Fe K$α$ Line to the Continuum

Authors: X. W. Shu, T. Yaqoob, K. D. Murphy, V. Braito, J. X. Wang, W. Zheng

Abstract: We present the analysis of an XMM observation of the Seyfert galaxy NGC 2992. The source was found in its highest level of X-ray activity yet detected, a factor $\sim 23.5$ higher in 2--10 keV flux than the historical minimum. NGC 2992 is known to exhibit X-ray flaring activity on timescales of days to weeks, and the XMM data provide at least a factor of $\sim 3$ better spectral resolution in the Fe K band than any previously measured flaring X-ray state. We find that there is a broad feature in the $\sim 5$--7 keV band which could be interpreted as a relativistic Fe K$α$ emission line. Its flux appears to have increased in tandem with the 2--10 keV continuum when compared to a previous Suzaku observation when the continuum was a factor of $\sim 8$ lower than that during the XMM observation. The XMM data are consistent with the general picture that increased X-ray activity and corresponding changes in the Fe K$α$ line emission occur in the innermost regions of the putative accretion disk. This behavior contrasts with the behavior of other AGN in which the Fe K$α$ line does not respond to variability in the X-ray.

Submitted 9 March, 2010; originally announced March 2010.

Comments: 30 pages, 6 figures, Accepted to ApJ

arXiv:0807.4669 [pdf, ps, other]

doi 10.1086/592141

Relativistic Outflows in two quasars in the Chandra Deep Field South

Authors: Z. Y. Zheng, J. X. Wang

Abstract: In this paper, we provide new 1 Ms $Chandra$ ACIS spectra of two quasars in the Chandra Deep Field South (CDF-S), which were previously reported to show strong and extremely blueshifted X-ray emission/absorption line features in previous 1 Ms spectra, with outflowing bulk velocity $v\sim$0.65-0.84c. In the new 1 Ms spectra, the relativistic blueshifted line feature is solidly confirmed in CXO CDFS J033225.3-274219 (CDFS 46, $z$ = 1.617), and marginally visible in CXO CDFS J033260.0-274748 (CDFS 11, $z$ = 2.579), probably due to the increased Chandra ACIS background in the new 1 Ms exposure. The new data rule out the possibility (though very tiny already based on the old 1Ms data) that the two sources were selected to be unusual due to noise spikes in the spectra. The only likely interpretation is extremely blueshifted iron absorption/emission line or absorption edge due to relativistic outflow. We find that the rest frame emission line center in CDFS 46 marginally decreased from 16.2 keV to 15.2 keV after 7 years. The line shift can be due to either decreasing outflowing velocity or lower ionization level. Including the two quasars reported in this paper, we collect from literature a total of 7 quasars showing blueshifted emission or absorption line feature with $v\geq0.4c$ in X-ray spectra, and discuss its connection to jet and/or BAL (broad absorption line) outflow.

Submitted 29 July, 2008; originally announced July 2008.

Comments: 16 pages, 4 figures, ApJ accepted

arXiv:0804.1288 [pdf, other]

doi 10.1016/j.cpc.2008.08.003

EtabFDC: an $η_b$ event generator in hadroproduction at LHC

Authors: Cong-Feng Qiao, Jian Wang, Jian Xiong Wang, Yangheng Zheng

Abstract: The EtabFDC is a matrix-element event generator package for $η_b$ hadroproduction at LHC. It generates $pp\toη_b X$ events for all possible parton-level $2\to2$ leading-order processes with three commonly used $η_b$ decay channels being implemented. The PYTHIA interface is used for parton-shower and hadronization to obtain the hadronic events. The FORTRAN codes of this package are generated by FDC (Feynman Diagram Calculation) system automatically.

Submitted 22 August, 2008; v1 submitted 8 April, 2008; originally announced April 2008.

Comments: 23 pages, 6 figures

Journal ref: Comput.Phys.Commun.180:61-68,2009

arXiv:0712.0352 [pdf, ps, other]

doi 10.1088/1009-9271/8/2/07

Are Seyfert 2 Galaxies without Polarized Broad Emission Lines More Obscured?

Authors: X. W. Shu, J. X. Wang, P. Jiang

Abstract: The new $XMM-Newton$ data of seven Seyfert 2 galaxies with optical spectropolarimetric observations are presented. The analysis of 0.5 -- 10 keV spectra shows that all four Seyfert 2 galaxies with polarized broad lines (PBLs) are absorbed with $N_{\rm H}<10^{24}$ cm$^{-2}$, while two of three Seyfert 2 galaxies without PBLs have evidence suggesting Compton-thick obscuration, supporting the conclusion that Seyfert 2 galaxies without PBLs are more obscured than those with PBLs. Adding the measured obscuration indicators ($N_{\rm H}$, $T$ ratio, and Fe K$α$ line EW) of six luminous AGNs to our previous sample improves the significance level of the difference in absorption from 92.3% to 96.3% for $N_{\rm H}$, 99.1% to 99.4% for $T$ ratio, and 95.3% to 97.4% for Fe K$α$ line EW. The present results support and enhance the suggestions that the absence of PBLs in Seyfert 2 galaxies can be explained by larger viewing angle of line of sight to the putative dusty torus, which lead to the obscuration of broad-line scattering screen, as expected by the unification model.

Submitted 3 December, 2007; originally announced December 2007.

Comments: 9 pages with 3 figures, accepted by ChJAA

arXiv:0707.3239 [pdf, ps, other]

doi 10.1086/521809

Chandra X-ray Sources in the LALA Cetus Field

Authors: J. X. Wang, Z. Y. Zheng, S. Malhotra, S. L. Finkelstein, J. E. Rhoads, C. A. Norman, T. M. Heckman

Abstract: The 174 ks Chandra Advanced CCD Imaging Spectrometer exposure of the Large Area Lyman Alpha Survey (LALA) Cetus field is the second of the two deep Chandra images on LALA fields. In this paper we present the Chandra X-ray sources detected in the Cetus field, along with an analysis of X-ray source counts, stacked X-ray spectrum, and optical identifications. A total of 188 X-ray sources were detected: 174 in the 0.5-7.0 keV band, 154 in the 0.5-2.0 keV band, and 113 in the 2.0-7.0 keV band. The X-ray source counts were derived and compared with LALA Bootes field (172 ks exposure). Interestingly, we find consistent hard band X-ray source density, but 36+-12% higher soft band X-ray source density in Cetus field. The weighted stacked spectrum of the detected X-ray sources can be fitted by a powerlaw with photon index Gamma = 1.55. Based on the weighted stacked spectrum, we find that the resolved fraction of the X-ray background drops from 72+-1% at 0.5-1.0 keV to 63+-4% at 6.0-8.0 keV. The unresolved spectrum can be fitted by a powerlaw over the range 0.5-7 keV, with a photon index Gamma = 1.22. We also present optical counterparts for 154 of the X-ray sources, down to a limiting magnitude of r' = 25.9 (Vega), using a deep r' band image obtained with the MMT.

Submitted 22 July, 2007; originally announced July 2007.

Comments: 21 pages, including 6 figures, 1 table, ApJ accepted

arXiv:0705.1021 [pdf]

doi 10.1038/nphys650

Satellite Observations of Separator Line Geometry of Three-Dimensional Magnetic Reconnection

Authors: C. J. Xiao, X. G. Wang, Z. Y. Pu, Z. W. Ma, H. Zhao, G. P. Zhou, J. X. Wang, M. G. Kivelson, S. Y. Fu, Z. X. Liu, Q. G. Zong, M. W. Dunlop, K-H. Glassmeier, E. Lucek, H. Reme, I. Dandouras, C. P. Escoubet

Abstract: Detection of a separator line that connects magnetic nulls and the determination of the dynamics and plasma environment of such a structure can improve our understanding of the three-dimensional (3D) magnetic reconnection process. However, this type of field and particle configuration has not been directly observed in space plasmas. Here we report the identification of a pair of nulls, the null-null line that connects them, and associated fans and spines in the magnetotail of Earth using data from the four Cluster spacecraft. With di and de designating the ion and electron inertial lengths, respectively, the separation between the nulls is found to be ~0.7di and an associated oscillation is identified as a lower hybrid wave with wavelength ~ de. This in situ evidence of the full 3D reconnection geometry and associated dynamics provides an important step toward to establishing an observational framework of 3D reconnection.

Submitted 1 July, 2007; v1 submitted 7 May, 2007; originally announced May 2007.

Comments: 10 pages, 3 figures and 1 table

Journal ref: Nature Physics advance online publication, 24 June 2007

Showing 1–50 of 80 results for author: Wang, J X