Truthfulness of Calibration Measures

Authors: Nika Haghtalab, Mingda Qiao, Kunhe Yang, Eric Zhao

Abstract: We initiate the study of the truthfulness of calibration measures in sequential prediction. A calibration measure is said to be truthful if the forecaster (approximately) minimizes the expected penalty by predicting the conditional expectation of the next outcome, given the prior distribution of outcomes. Truthfulness is an important property of calibration measures, ensuring that the forecaster is not incentivized to exploit the system with deliberate poor forecasts. This makes it an essential desideratum for calibration measures, alongside typical requirements, such as soundness and completeness. We conduct a taxonomy of existing calibration measures and their truthfulness. Perhaps surprisingly, we find that all of them are far from being truthful. That is, under existing calibration measures, there are simple distributions on which a polylogarithmic (or even zero) penalty is achievable, while truthful prediction leads to a polynomial penalty. Our main contribution is the introduction of a new calibration measure termed the Subsampled Smooth Calibration Error (SSCE) under which truthful prediction is optimal up to a constant multiplicative factor.

Submitted 18 July, 2024; originally announced July 2024.

arXiv:2406.07006 [pdf, other]

MIPI 2024 Challenge on Few-shot RAW Image Denoising: Methods and Results

Authors: Xin Jin, Chunle Guo, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Ruoqi Li, Chang Liu, Ziyi Wang, Yao Du, Jingjing Yang, Long Bao, Heng Sun, Xiangyu Kong, Xiaoxia Xing, Jinlong Wu, Yuanyang Xue, Hyunhee Park, Sejun Song, Changho Kim, Jingfan Tan, et al. (17 additional authors not shown)

Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). Building on the achievements of the previous MIPI Workshops held at ECCV 2022 and CVPR 2023, we introduce our third MIPI challenge including three tracks focusing on novel image sensors and imaging algorithms. In this paper, we summarize and review the Few-shot RAW Image Denoising track on MIPI 2024. In total, 165 participants were successfully registered, and 7 teams submitted results in the final testing phase. The developed solutions in this challenge achieved state-of-the-art performance on Few-shot RAW Image Denoising. More details of this challenge and the link to the dataset can be found at https://mipichallenge.org/MIPI2024.

Submitted 11 June, 2024; originally announced June 2024.

Comments: CVPR 2024 Mobile Intelligent Photography and Imaging (MIPI) Workshop--Few-shot RAW Image Denoising Challenge Report. Website: https://mipi-challenge.org/MIPI2024/

arXiv:2405.18435 [pdf, other]

QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge

Authors: Hongwei Bran Li, Fernando Navarro, Ivan Ezhov, Amirhossein Bayat, Dhritiman Das, Florian Kofler, Suprosanna Shit, Diana Waldmannstetter, Johannes C. Paetzold, Xiaobin Hu, Benedikt Wiestler, Lucas Zimmer, Tamaz Amiranashvili, Chinmay Prabhakar, Christoph Berger, Jonas Weidner, Michelle Alonso-Basant, Arif Rashid, Ujjwal Baid, Wesam Adel, Deniz Ali, Bhakti Baheti, Yingbin Bai, Ishaan Bhatt, Sabri Can Cetindag, et al. (55 additional authors not shown)

Abstract: Uncertainty in medical image segmentation tasks, especially inter-rater variability, arising from differences in interpretations and annotations by various experts, presents a significant challenge in achieving consistent and reliable image segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the development and evaluation of automated segmentation algorithms. Accurately modeling and quantifying this variability is essential for enhancing the robustness and clinical applicability of these algorithms. We report the set-up and summarize the benchmark results of the Quantification of Uncertainties in Biomedical Image Quantification Challenge (QUBIQ), which was organized in conjunction with International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2020 and 2021. The challenge focuses on the uncertainty quantification of medical image segmentation which considers the omnipresence of inter-rater variability in imaging datasets. The large collection of images with multi-rater annotations features various modalities such as MRI and CT; various organs such as the brain, prostate, kidney, and pancreas; and different image dimensions 2D-vs-3D. A total of 24 teams submitted different solutions to the problem, combining various baseline models, Bayesian neural networks, and ensemble model techniques. The obtained results indicate the importance of the ensemble models, as well as the need for further research to develop efficient 3D methods for uncertainty quantification methods in 3D segmentation tasks.

Submitted 24 June, 2024; v1 submitted 19 March, 2024; originally announced May 2024.

Comments: initial technical report

arXiv:2405.10246 [pdf, other]

A Foundation Model for Brain Lesion Segmentation with Mixture of Modality Experts

Authors: Xinru Zhang, Ni Ou, Berke Doga Basaran, Marco Visentin, Mengyun Qiao, Renyang Gu, Cheng Ouyang, Yaou Liu, Paul M. Matthew, Chuyang Ye, Wenjia Bai

Abstract: Brain lesion segmentation plays an essential role in neurological research and diagnosis. As brain lesions can be caused by various pathological alterations, different types of brain lesions tend to manifest with different characteristics on different imaging modalities. Due to this complexity, brain lesion segmentation methods are often developed in a task-specific manner. A specific segmentation model is developed for a particular lesion type and imaging modality. However, the use of task-specific models requires predetermination of the lesion type and imaging modality, which complicates their deployment in real-world scenarios. In this work, we propose a universal foundation model for 3D brain lesion segmentation, which can automatically segment different types of brain lesions for input data of various imaging modalities. We formulate a novel Mixture of Modality Experts (MoME) framework with multiple expert networks attending to different imaging modalities. A hierarchical gating network combines the expert predictions and fosters expertise collaboration. Furthermore, we introduce a curriculum learning strategy during training to avoid the degeneration of each expert network and preserve their specialization. We evaluated the proposed method on nine brain lesion datasets, encompassing five imaging modalities and eight lesion types. The results show that our model outperforms state-of-the-art universal models and provides promising generalization to unseen datasets.

Submitted 16 July, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

Comments: The work has been early accepted by MICCAI 2024

arXiv:2404.11100 [pdf, other]

Synthesizing Realistic Data for Table Recognition

Authors: Qiyu Hou, Jun Wang, Meixuan Qiao, Lujun Tian

Abstract: To overcome the limitations and challenges of current automatic table data annotation methods and random table data synthesis approaches, we propose a novel method for synthesizing annotation data specifically designed for table recognition. This method utilizes the structure and content of existing complex tables, facilitating the efficient creation of tables that closely replicate the authentic styles found in the target domain. By leveraging the actual structure and content of tables from Chinese financial announcements, we have developed the first extensive table annotation dataset in this domain. We used this dataset to train several recent deep learning-based end-to-end table recognition models. Additionally, we have established the inaugural benchmark for real-world complex tables in the Chinese financial announcement domain, using it to assess the performance of models trained on our synthetic data, thereby effectively validating our method's practicality and effectiveness. Furthermore, we applied our synthesis method to augment the FinTabNet dataset, extracted from English financial announcements, by increasing the proportion of tables with multiple spanning cells to introduce greater complexity. Our experiments show that models trained on this augmented dataset achieve comprehensive improvements in performance, especially in the recognition of tables with multiple spanning cells.

Submitted 9 July, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

Comments: ICDAR 2024

arXiv:2403.05775 [pdf, other]

Scalable $k$-clique Densest Subgraph Search

Authors: Xiaowei Ye, Miao Qiao, Rong-Hua Li, Qi Zhang, Guoren Wang

Abstract: In this paper, we present a collection of novel and scalable algorithms designed to tackle the challenges inherent in the $k$-clique densest subgraph problem (\kcdsp) within network analysis. We propose \psctl, a novel algorithm based on the Frank-Wolfe approach for addressing \kcdsp, effectively solving a distinct convex programming problem. \psctl is able to approximate \kcdsp with near optimal guarantees. The notable advantage of \psctl lies in its time complexity, which is independent of the count of $k$-cliques, resulting in remarkable efficiency in practical applications. Additionally, we present \spath, a sampling-based algorithm with the capability to handle networks on an unprecedented scale, reaching up to $1.8\times 10^9$ edges. By leveraging the \ccpath algorithm as a uniform $k$-clique sampler, \spath ensures the efficient processing of large-scale network data, accompanied by a detailed analysis of accuracy guarantees. Together, these contributions represent a significant advancement in the field of $k$-clique densest subgraph discovery. In experimental evaluations, our algorithms demonstrate orders of magnitude faster performance compared to the current state-of-the-art solutions.

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2402.15169 [pdf, ps, other]

Platforms for Efficient and Incentive-Aware Collaboration

Authors: Nika Haghtalab, Mingda Qiao, Kunhe Yang

Abstract: Collaboration is crucial for reaching collective goals. However, its effectiveness is often undermined by the strategic behavior of individual agents -- a fact that is captured by a high Price of Stability (PoS) in recent literature [Blum et al., 2021]. Implicit in the traditional PoS analysis is the assumption that agents have full knowledge of how their tasks relate to one another. We offer a new perspective on bringing about efficient collaboration among strategic agents using information design. Inspired by the growing importance of collaboration in machine learning (such as platforms for collaborative federated learning and data cooperatives), we propose a framework where the platform has more information about how the agents' tasks relate to each other than the agents themselves. We characterize how and to what degree such platforms can leverage their information advantage to steer strategic agents toward efficient collaboration. Concretely, we consider collaboration networks where each node is a task type held by one agent, and each task benefits from contributions made in their inclusive neighborhood of tasks. This network structure is known to the agents and the platform, but only the platform knows each agent's real location -- from the agents' perspective, their location is determined by a random permutation. We employ private Bayesian persuasion and design two families of persuasive signaling schemes that the platform can use to ensure a small total workload when agents follow the signal. The first family aims to achieve the minmax optimal approximation ratio compared to the optimal collaboration, which is shown to be $\Theta(\sqrt{n})$ for unit-weight graphs, $\Theta(n^{2/3})$ for graphs with constant minimum edge weights, and $O(n^{3/4})$ for general weighted graphs. The second family ensures per-instance strict improvement compared to full information disclosure.

Submitted 23 February, 2024; originally announced February 2024.

arXiv:2402.10445 [pdf, other]

Collaborative Learning with Different Labeling Functions

Authors: Yuyang Deng, Mingda Qiao

Abstract: We study a variant of Collaborative PAC Learning, in which we aim to learn an accurate classifier for each of the $n$ data distributions, while minimizing the number of samples drawn from them in total. Unlike in the usual collaborative learning setup, it is not assumed that there exists a single classifier that is simultaneously accurate for all distributions. We show that, when the data distributions satisfy a weaker realizability assumption, which appeared in [Crammer and Mansour, 2012] in the context of multi-task learning, sample-efficient learning is still feasible. We give a learning algorithm based on Empirical Risk Minimization (ERM) on a natural augmentation of the hypothesis class, and the analysis relies on an upper bound on the VC dimension of this augmented class. In terms of the computational efficiency, we show that ERM on the augmented hypothesis class is NP-hard, which gives evidence against the existence of computationally efficient learners in general. On the positive side, for two special cases, we give learners that are both sample- and computationally-efficient.

Submitted 22 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

Comments: To appear at ICML 2024; v2 and v3 included additional discussion on related work

arXiv:2402.07458 [pdf, other]

On the Distance from Calibration in Sequential Prediction

Authors: Mingda Qiao, Letian Zheng

Abstract: We study a sequential binary prediction setting where the forecaster is evaluated in terms of the calibration distance, which is defined as the $L_1$ distance between the predicted values and the set of predictions that are perfectly calibrated in hindsight. This is analogous to a calibration measure recently proposed by Błasiok, Gopalan, Hu and Nakkiran (STOC 2023) for the offline setting. The calibration distance is a natural and intuitive measure of deviation from perfect calibration, and satisfies a Lipschitz continuity property which does not hold for many popular calibration measures, such as the $L_1$ calibration error and its variants. We prove that there is a forecasting algorithm that achieves an $O(\sqrt{T})$ calibration distance in expectation on an adversarially chosen sequence of $T$ binary outcomes. At the core of this upper bound is a structural result showing that the calibration distance is accurately approximated by the lower calibration distance, which is a continuous relaxation of the former. We then show that an $O(\sqrt{T})$ lower calibration distance can be achieved via a simple minimax argument and a reduction to online learning on a Lipschitz class. On the lower bound side, an $\Omega(T^{1/3})$ calibration distance is shown to be unavoidable, even when the adversary outputs a sequence of independent random bits, and has an additional ability to early stop (i.e., to stop producing random bits and output the same bit in the remaining steps). Interestingly, without this early stopping, the forecaster can achieve a much smaller calibration distance of $\mathrm{polylog}(T)$.

Submitted 27 May, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

Comments: To appear at COLT 2024; v2 fixed minor typos

arXiv:2401.02875 [pdf, other]

The Dust Attenuation Scaling Relation of Star-Forming Galaxies in the EAGLE Simulations

Authors: Man Qiao, Xian Zhong Zheng, Antonios Katsianis, Jianbo Qin, Zhizheng Pan, Wenhao Liu, Qing-Hua Tan, Fang Xia An, Dong Dong Shi, Zongfei Lü, Yuheng Zhang, Run Wen, Shuang Liu, Chao Yang

Abstract: Dust attenuation in star-forming galaxies (SFGs), as parameterized by the infrared excess (IRX $\equiv L_{\rm IR}/L_{\rm UV}$), is found to be tightly correlated with star formation rate (SFR), metallicity and galaxy size, following a universal IRX relation up to $z=3$. This scaling relation can provide a fundamental constraint for theoretical models to reconcile galaxy star formation, chemical enrichment, and structural evolution across cosmic time. We attempt to reproduce the universal IRX relation over $0.1\leq z\leq 2.5$ using the EAGLE hydrodynamical simulations and examine sensitive parameters in determining galaxy dust attenuation. Our findings show that while the predicted universal IRX relation from EAGLE approximately aligns with observations at $z\leq 0.5$, noticeable disparities arise at different stellar masses and higher redshifts. Specifically, we investigate how modifying various galaxy parameters can affect the predicted universal IRX relation in comparison to the observed data. We demonstrate that the simulated gas-phase metallicity is the critical quantity for the shape of the predicted universal IRX relation. We find that the influence of the infrared luminosity and infrared excess is less important while galaxy size has virtually no significant effect. Overall, the EAGLE simulations are not able to replicate some of the observed characteristics between IRX and galaxy parameters of SFGs, emphasizing the need for further investigation and testing for our current state-of-the-art theoretical models.

Submitted 5 January, 2024; originally announced January 2024.

Comments: 19 pages, 15 figures, accepted for publication in MNRAS

arXiv:2312.16700 [pdf, other]

doi 10.1093/mnras/stad3999

Understanding the Universal Dust Attenuation Scaling Relation of Star-Forming Galaxies

Authors: J. Qin, X. Z. Zheng, S. Wuyts, Z. Lv, M. Qiao, J. -S. Huang, F. S. Liu, A. Katsianis, V. Gonzalez, F. Bian, H. Xu, Z. Pan, W. Liu, Q. -H. Tan, F. X. An, D. D. Shi, Y. Zhang, R. Wen, S. Liu, C. Yang

Abstract: Star-forming galaxies (SFGs) adhere to a surprisingly tight scaling relation of dust attenuation parameterized by the infrared excess (IRX=$L_{\rm IR}/L_{\rm UV}$), being jointly determined by the star formation rate (SFR), galaxy size ($R_{\rm e}$), metallicity ($Z$/Z$_\odot$) and axial ratio ($b/a$). We examine how these galaxy parameters determine the effective dust attenuation and give rise to the universal IRX relation, utilizing a simple two-component star-dust geometry model in which dust in the dense and diffuse interstellar medium (ISM) follows exponential mass density profiles, connected with but not necessarily identical to the stellar mass profiles. Meanwhile, empirical relations are adopted to link galaxy properties, including the gas--star formation relation, the dust-to-stellar size relation, as well as the dust-to-gas ratio versus metallicity relation. By fitting a large sample of local SFGs with the model, we obtain the best-fitting model parameters as a function of metallicity, showing that the two-component geometry model is able to successfully reproduce the dependence of IRX on SFR, $R_{\rm e}$, $b/a$ at given $Z$/Z$_\odot$, as well as the dependence of power-law indices on metallicity. Moreover, we also retrieve constraints on the model geometry parameters, including the optical depth of birth clouds (BCs), BC-to-total dust mass fraction, BC covering factor of UV-emitting stars, and star-to-total dust disc radius ratio, which all evolve with galaxy metallicity. Finally, a consistent picture of how the star-dust geometry in SFGs evolves with galaxy metallicity is discussed.

Submitted 30 January, 2024; v1 submitted 27 December, 2023; originally announced December 2023.

Comments: 20 pages, 10 figures, published in MNRAS (2024, Volume 528, Issue 1, pp.658-675); A Python package IRX_TAU_TOT is available at https://github.com/LvZF/irx_tau_tot/ to calculate the total dust optical depth of a galaxy with given metallicity and best-fitting geometry parameters

Journal ref: MNRAS, 528, 658 (2024)

arXiv:2311.16416 [pdf, other]

A Combinatorial Approach to Robust PCA

Authors: Weihao Kong, Mingda Qiao, Rajat Sen

Abstract: We study the problem of recovering Gaussian data under adversarial corruptions when the noises are low-rank and the corruptions are on the coordinate level. Concretely, we assume that the Gaussian noises lie in an unknown $k$-dimensional subspace $U \subseteq \mathbb{R}^d$, and $s$ randomly chosen coordinates of each data point fall into the control of an adversary. This setting models the scenario of learning from high-dimensional yet structured data that are transmitted through a highly-noisy channel, so that the data points are unlikely to be entirely clean. Our main result is an efficient algorithm that, when $ks^2 = O(d)$, recovers every single data point up to a nearly-optimal $\ell_1$ error of $\tilde O(ks/d)$ in expectation. At the core of our proof is a new analysis of the well-known Basis Pursuit (BP) method for recovering a sparse signal, which is known to succeed under additional assumptions (e.g., incoherence or the restricted isometry property) on the underlying subspace $U$. In contrast, we present a novel approach via studying a natural combinatorial problem and show that, over the randomness in the support of the sparse signal, a high-probability error bound is possible even if the subspace $U$ is arbitrary.

Submitted 27 November, 2023; originally announced November 2023.

Comments: To appear at ITCS 2024

arXiv:2309.16853 [pdf, other]

T1/T2 relaxation temporal modelling from accelerated acquisitions using a Latent Transformer

Authors: Fanwen Wang, Michael Tanzer, Mengyun Qiao, Wenjia Bai, Daniel Rueckert, Guang Yang, Sonia Nielles-Vallespin

Abstract: Quantitative cardiac magnetic resonance T1 and T2 mapping enable myocardial tissue characterisation but the lengthy scan times restrict their widespread clinical application. We propose a deep learning method that incorporates a time dependency Latent Transformer module to model relationships between parameterised time frames for improved reconstruction from undersampled data. The module, implemented as a multi-resolution sequence-to-sequence transformer, is integrated into an encoder-decoder architecture to leverage the inherent temporal correlations in relaxation processes. The presented results for accelerated T1 and T2 mapping show the model recovers maps with higher fidelity by explicit incorporation of time dynamics. This work demonstrates the importance of temporal modelling for artifact-free reconstruction in quantitative MRI.

Submitted 28 September, 2023; originally announced September 2023.

arXiv:2308.09442 [pdf, other]

BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine

Authors: Yizhen Luo, Jiahuan Zhang, Siqi Fan, Kai Yang, Yushuai Wu, Mu Qiao, Zaiqing Nie

Abstract: Foundation models (FMs) have exhibited remarkable performance across a wide range of downstream tasks in many domains. Nevertheless, general-purpose FMs often face challenges when confronted with domain-specific problems, due to their limited access to the proprietary training data in a particular domain. In biomedicine, there are various biological modalities, such as molecules, proteins, and cells, which are encoded by the language of life and exhibit significant modality gaps with human natural language. In this paper, we introduce BioMedGPT, an open multimodal generative pre-trained transformer (GPT) for biomedicine, to bridge the gap between the language of life and human natural language. BioMedGPT allows users to easily "communicate" with diverse biological modalities through free text, which is the first of its kind. BioMedGPT aligns different biological modalities with natural language via a large generative language model, namely, BioMedGPT-LM. We publish BioMedGPT-10B, which unifies the feature spaces of molecules, proteins, and natural language via encoding and alignment. Through fine-tuning, BioMedGPT-10B outperforms or is on par with human and significantly larger general-purpose foundation models on the biomedical QA task. It also demonstrates promising performance in the molecule QA and protein QA tasks, which could greatly accelerate the discovery of new drugs and therapeutic targets. In addition, BioMedGPT-LM-7B is the first large generative language model based on Llama2 in the biomedical domain, therefore is commercial friendly. Both BioMedGPT-10B and BioMedGPT-LM-7B are open-sourced to the research community. In addition, we publish the datasets that are meticulously curated for the alignment of multi-modalities, i.e., PubChemQA and UniProtQA. All the models, codes, and datasets are available at https://github.com/PharMolix/OpenBioMed.

Submitted 21 August, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

Comments: 12 pages, 4 figures

arXiv:2308.09026 [pdf, ps, other]

LesionMix: A Lesion-Level Data Augmentation Method for Medical Image Segmentation

Authors: Berke Doga Basaran, Weitong Zhang, Mengyun Qiao, Bernhard Kainz, Paul M. Matthews, Wenjia Bai

Abstract: Data augmentation has become a de facto component of deep learning-based medical image segmentation methods. Most data augmentation techniques used in medical imaging focus on spatial and intensity transformations to improve the diversity of training images. They are often designed at the image level, augmenting the full image, and do not pay attention to specific abnormalities within the image. Here, we present LesionMix, a novel and simple lesion-aware data augmentation method. It performs augmentation at the lesion level, increasing the diversity of lesion shape, location, intensity and load distribution, and allowing both lesion populating and inpainting. Experiments on different modalities and different lesion datasets, including four brain MR lesion datasets and one liver CT lesion dataset, demonstrate that LesionMix achieves promising performance in lesion image segmentation, outperforming several recent Mix-based data augmentation methods. The code will be released at https://github.com/dogabasaran/lesionmix.

Submitted 17 August, 2023; originally announced August 2023.

Comments: 13 pages, 5 figures, 4 tables, MICCAI DALI Workshop 2023

arXiv:2307.12820 [pdf, other]

Diurnal modulation of electron recoils from DM-nucleon scattering through the Migdal effect

Authors: Mai Qiao, Chen Xia, Yu-Feng Zhou

Abstract: Halo dark matter (DM) particles could lose energy by scattering off nuclei within the Earth before reaching the underground detectors of DM direct detection experiments. This Earth shielding effect can result in diurnal modulation of the DM-induced recoil event rates observed underground due to the self-rotation of the Earth. For electron recoil signals from DM-electron scatterings, the current experimental constraints are so stringent that the diurnal modulation cannot be observed for halo DM. We propose a novel type of diurnal modulation effect: diurnal modulation in electron recoil signals induced by DM-nucleon scattering via the Migdal effect. We set the most stringent constraints to date on the DM-nucleon scattering cross section via the Migdal effect for sub-GeV DM using the S2-only data of PandaX-II and PandaX-4T with improved simulations of the Earth shielding effect. Based on the updated constraints, we show that the Migdal effect induced diurnal modulation of electron events can still be significant in the low energy region, and can be probed by experiments such as PandaX-4T in the near future.

Submitted 1 November, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

Comments: Comments on the Migdal effects added, figures and text improved, version accepted by JCAP

arXiv:2307.11133 [pdf, other]

Contrastive Graph Pooling for Explainable Classification of Brain Networks

Authors: Jiaxing Xu, Qingtian Bian, Xinhang Li, Aihu Zhang, Yiping Ke, Miao Qiao, Wei Zhang, Wei Khang Jeremy Sim, Balázs Gulyás

Abstract: Functional magnetic resonance imaging (fMRI) is a commonly used technique to measure neural activation. Its application has been particularly important in identifying underlying neurodegenerative conditions such as Parkinson's, Alzheimer's, and Autism. Recent analysis of fMRI data models the brain as a graph and extracts features by graph neural networks (GNNs). However, the unique characteristics of fMRI data require a special design of GNN. Tailoring GNN to generate effective and domain-explainable features remains challenging. In this paper, we propose a contrastive dual-attention block and a differentiable graph pooling method called ContrastPool to better utilize GNN for brain networks, meeting fMRI-specific requirements. We apply our method to 5 resting-state fMRI brain network datasets of 3 diseases and demonstrate its superiority over state-of-the-art baselines. Our case study confirms that the patterns extracted by our method match the domain knowledge in neuroscience literature, and disclose direct and interesting insights. Our contributions underscore the potential of ContrastPool for advancing the understanding of brain networks and neurodegenerative conditions. The source code is available at https://github.com/AngusMonroe/ContrastPool.

Submitted 12 April, 2024; v1 submitted 7 July, 2023; originally announced July 2023.

arXiv:2307.08347 [pdf, other]

M-FLAG: Medical Vision-Language Pre-training with Frozen Language Models and Latent Space Geometry Optimization

Authors: Che Liu, Sibo Cheng, Chen Chen, Mengyun Qiao, Weitong Zhang, Anand Shah, Wenjia Bai, Rossella Arcucci

Abstract: Medical vision-language models enable co-learning and integrating features from medical imaging and clinical text. However, these models are not easy to train and the latent representation space can be complex. Here we propose a novel way for pre-training and regularising medical vision-language models. The proposed method, named Medical vision-language pre-training with Frozen language models and Latent spAce Geometry optimization (M-FLAG), leverages a frozen language model for training stability and efficiency and introduces a novel orthogonality loss to harmonize the latent space geometry. We demonstrate the potential of the pre-trained model on three downstream tasks: medical image classification, segmentation, and object detection. Extensive experiments across five public datasets demonstrate that M-FLAG significantly outperforms existing medical vision-language pre-training approaches and reduces the number of parameters by 78%. Notably, M-FLAG achieves outstanding performance on the segmentation task while using only 1% of the RSNA dataset, even outperforming ImageNet pre-trained models that have been fine-tuned using 100% of the data.

Submitted 19 July, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

Comments: Accepted by MICCAI 2023

arXiv:2304.13240 [pdf, other]

Structure Diagram Recognition in Financial Announcements

Authors: Meixuan Qiao, Jun Wang, Junfu Xiang, Qiyu Hou, Ruixuan Li

Abstract: Accurately extracting structured data from structure diagrams in financial announcements is of great practical importance for building financial knowledge graphs and further improving the efficiency of various financial applications. First, we proposed a new method for recognizing structure diagrams in financial announcements, which can better detect and extract different types of connecting lines, including straight lines, curves, and polylines of different orientations and angles. Second, we developed a two-stage method to efficiently generate the industry's first benchmark of structure diagrams from Chinese financial announcements, where a large number of diagrams were synthesized and annotated using an automated tool to train a preliminary recognition model with fairly good performance, and then a high-quality benchmark can be obtained by automatically annotating the real-world structure diagrams using the preliminary model and then making few manual corrections. Finally, we experimentally verified the significant performance advantage of our structure diagram recognition method over previous methods.

Submitted 1 May, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

Comments: ICDAR2023

arXiv:2303.12644 [pdf, other]

doi 10.1007/978-3-031-43999-5_14

Feature-Conditioned Cascaded Video Diffusion Models for Precise Echocardiogram Synthesis

Authors: Hadrien Reynaud, Mengyun Qiao, Mischa Dombrowski, Thomas Day, Reza Razavi, Alberto Gomez, Paul Leeson, Bernhard Kainz

Abstract: Image synthesis is expected to provide value for the translation of machine learning methods into clinical practice. Fundamental problems like model robustness, domain transfer, causal modelling, and operator training become approachable through synthetic data. Especially, heavily operator-dependant modalities like Ultrasound imaging require robust frameworks for image and video generation. So far, video generation has only been possible by providing input data that is as rich as the output data, e.g., image sequence plus conditioning in, video out. However, clinical documentation is usually scarce and only single images are reported and stored, thus retrospective patient-specific analysis or the generation of rich training data becomes impossible with current approaches. In this paper, we extend elucidated diffusion models for video modelling to generate plausible video sequences from single images and arbitrary conditioning with clinical parameters. We explore this idea within the context of echocardiograms by looking into the variation of the Left Ventricle Ejection Fraction, the most essential clinical metric gained from these examinations. We use the publicly available EchoNet-Dynamic dataset for all our experiments. Our image to sequence approach achieves an $R^2$ score of 93%, which is 38 points higher than recently proposed sequence to sequence generation methods. Code and models will be available at: https://github.com/HReynaud/EchoDiffusion.

Submitted 21 February, 2024; v1 submitted 22 March, 2023; originally announced March 2023.

Comments: Published in MICCAI 2023 proceedings. https://link.springer.com/chapter/10.1007/978-3-031-43999-5_14

arXiv:2303.11376 [pdf, other]

GNN-Ensemble: Towards Random Decision Graph Neural Networks

Authors: Wenqi Wei, Mu Qiao, Divyesh Jadav

Abstract: Graph Neural Networks (GNNs) have enjoyed widespread applications in graph-structured data. However, existing graph-based applications commonly lack annotated data. GNNs are required to learn latent patterns from a limited amount of training data to perform inferences on a vast amount of test data. The increased complexity of GNNs, as well as a single point of model parameter initialization, usually leads to overfitting and sub-optimal performance. In addition, it is known that GNNs are vulnerable to adversarial attacks. In this paper, we push one step forward on the ensemble learning of GNNs with improved accuracy, generalization, and adversarial robustness. Following the principles of stochastic modeling, we propose a new method called GNN-Ensemble to construct an ensemble of random decision graph neural networks whose capacity can be arbitrarily expanded for improvement in performance. The essence of the method is to build multiple GNNs in randomly selected substructures in the topological space and subfeatures in the feature space, and then combine them for final decision making. These GNNs in different substructure and subfeature spaces generalize their classification in complementary ways. Consequently, their combined classification performance can be improved and overfitting on the training data can be effectively reduced. In the meantime, we show that GNN-Ensemble can significantly improve the adversarial robustness against attacks on GNNs.

Submitted 20 March, 2023; originally announced March 2023.

arXiv:2302.10436 [pdf, other]

doi 10.1038/s41534-023-00784-8

Error-Mitigated Quantum Simulation of Interacting Fermions with Trapped Ions

Authors: Wentao Chen, Shuaining Zhang, Jialiang Zhang, Xiaolu Su, Yao Lu, Kuan Zhang, Mu Qiao, Ying Li, Jing-Ning Zhang, Kihwan Kim

Abstract: Quantum error mitigation has been extensively explored to increase the accuracy of the quantum circuits in noisy-intermediate-scale-quantum (NISQ) computation, where quantum error correction requiring additional quantum resources is not adopted. Among various error-mitigation schemes, probabilistic error cancellation (PEC) has been proposed as a general and systematic protocol that can be applied to numerous hardware platforms and quantum algorithms. However, PEC has only been tested in two-qubit systems and a superconducting multi-qubit system by learning a sparse error model. Here, we benchmark PEC using up to four trapped-ion qubits. For the benchmark, we simulate the dynamics of interacting fermions with or without spins by applying multiple Trotter steps. By tomographically reconstructing the error model and incorporating other mitigation methods such as positive probability and symmetry constraints, we are able to increase the fidelity of simulation and faithfully observe the dynamics of the Fermi-Hubbard model, including the different behavior of charge and spin of fermions. Our demonstrations can be an essential step for further extending systematic error-mitigation schemes toward practical quantum advantages.

Submitted 20 February, 2023; originally announced February 2023.

Comments: 15 pages, 11 figures

Journal ref: npj Quantum Information 9, 122 (2023)

arXiv:2301.13098 [pdf, other]

doi 10.1109/TMI.2023.3331982

CHeart: A Conditional Spatio-Temporal Generative Model for Cardiac Anatomy

Authors: Mengyun Qiao, Shuo Wang, Huaqi Qiu, Antonio de Marvao, Declan P. O'Regan, Daniel Rueckert, Wenjia Bai

Abstract: Two key questions in cardiac image analysis are to assess the anatomy and motion of the heart from images; and to understand how they are associated with non-imaging clinical factors such as gender, age and diseases. While the first question can often be addressed by image segmentation and motion tracking algorithms, our capability to model and to answer the second question is still limited. In this work, we propose a novel conditional generative model to describe the 4D spatio-temporal anatomy of the heart and its interaction with non-imaging clinical factors. The clinical factors are integrated as the conditions of the generative modelling, which allows us to investigate how these factors influence the cardiac anatomy. We evaluate the model performance in mainly two tasks, anatomical sequence completion and sequence generation. The model achieves a high performance in anatomical sequence completion, comparable to or outperforming other state-of-the-art generative models. In terms of sequence generation, given clinical conditions, the model can generate realistic synthetic 4D sequential anatomies that share similar distributions with the real data.

Submitted 30 November, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

Comments: Accepted by IEEE Transactions on Medical Imaging

arXiv:2212.02055 [pdf, other]

Graph Convolutional Neural Networks with Diverse Negative Samples via Decomposed Determinant Point Processes

Authors: Wei Duan, Junyu Xuan, Maoying Qiao, Jie Lu

Abstract: Graph convolutional networks (GCNs) have achieved great success in graph representation learning by extracting high-level features from nodes and their topology. Since GCNs generally follow a message-passing mechanism, each node aggregates information from its first-order neighbours to update its representation. As a result, the representations of nodes with edges between them should be positively correlated and thus can be considered positive samples. However, there are more non-neighbour nodes in the whole graph, which provide diverse and useful information for the representation update. Two non-adjacent nodes usually have different representations, which can be seen as negative samples. Besides the node representations, the structural information of the graph is also crucial for learning. In this paper, we used quality-diversity decomposition in determinant point processes (DPP) to obtain diverse negative samples. When defining a distribution on diverse subsets of all non-neighbouring nodes, we incorporate both graph structure information and node representations. Since the DPP sampling process requires matrix eigenvalue decomposition, we propose a new shortest-path-based method to improve computational efficiency. Finally, we incorporate the obtained negative samples into the graph convolution operation. The ideas are evaluated empirically in experiments on node classification tasks. These experiments show that the newly proposed methods not only improve the overall performance of standard representation learning but also significantly alleviate over-smoothing problems.

Submitted 6 September, 2023; v1 submitted 5 December, 2022; originally announced December 2022.

Comments: Accepted by IEEE TNNLS on 30-Aug-2023. arXiv admin note: text overlap with arXiv:2210.00728

arXiv:2211.12421 [pdf, other]

Data-Driven Network Neuroscience: On Data Collection and Benchmark

Authors: Jiaxing Xu, Yunhan Yang, David Tse Jung Huang, Sophi Shilpa Gururajapathy, Yiping Ke, Miao Qiao, Alan Wang, Haribalan Kumar, Josh McGeown, Eryn Kwon

Abstract: This paper presents a comprehensive and quality collection of functional human brain network data for potential research in the intersection of neuroscience, machine learning, and graph analytics. Anatomical and functional MRI images have been used to understand the functional connectivity of the human brain and are particularly important in identifying underlying neurodegenerative conditions such as Alzheimer's, Parkinson's, and Autism. Recently, the study of the brain in the form of brain networks using machine learning and graph analytics has become increasingly popular, especially to predict the early onset of these conditions. A brain network, represented as a graph, retains rich structural and positional information that traditional examination methods are unable to capture. However, the lack of publicly accessible brain network data prevents researchers from data-driven explorations. One of the main difficulties lies in the complicated domain-specific preprocessing steps and the exhaustive computation required to convert the data from MRI images into brain networks. We bridge this gap by collecting a large amount of MRI images from public databases and a private source, working with domain experts to make sensible design choices, and preprocessing the MRI images to produce a collection of brain network datasets. The datasets originate from 6 different sources, cover 4 brain conditions, and consist of a total of 2,702 subjects. We test our graph datasets on 12 machine learning models to provide baselines and validate the data quality on a recent graph analysis model. To lower the barrier to entry and promote the research in this interdisciplinary field, we release our brain network data and complete preprocessing details including codes at https://doi.org/10.17608/k6.auckland.21397377 and https://github.com/brainnetuoa/data_driven_network_neuroscience.

Submitted 29 October, 2023; v1 submitted 10 November, 2022; originally announced November 2022.

Journal ref: Advances in Neural Information Processing Systems, 2023

arXiv:2210.16041 [pdf, other]

Centralization Problem for Opinion Convergence in Decentralized Networks

Authors: Yiping Liu, Jiamou Liu, Bakhadyr Khoussaino, Miao Qiao, Bo Yan

Abstract: This paper aims to provide a new perspective on the interplay between decentralization -- a prevalent character of multi-agent systems -- and centralization, i.e., the task of imposing central control to meet system-level goals. In particular, in the context of networked opinion dynamic model, the paper proposes and discusses a framework for centralization. More precisely, a decentralized network consists of autonomous agents and their social structure that is unknown and dynamic. Centralization is a process of appointing agents in the network to act as access units who provide information and exert influence over their local surroundings. We discuss centralization for the DeGroot model of opinion dynamics, aiming to enforce opinion convergence using the minimum number of access units. We show that the key to the centralization process lies in selecting access units so that they form a dominating set. We then propose algorithms under a new local algorithmic framework, namely prowling, to accomplish this task. To validate our algorithm, we perform systematic experiments over both real-world and synthetic networks and verify that our algorithm outperforms benchmarks.

Submitted 28 October, 2022; originally announced October 2022.

arXiv:2210.02415 [pdf, other]

A Fourier Approach to Mixture Learning

Authors: Mingda Qiao, Guru Guruganesh, Ankit Singh Rawat, Avinava Dubey, Manzil Zaheer

Abstract: We revisit the problem of learning mixtures of spherical Gaussians. Given samples from mixture $\frac{1}{k}\sum_{j=1}^{k}\mathcal{N}(\mu_j, I_d)$, the goal is to estimate the means $\mu_1, \mu_2, \ldots, \mu_k \in \mathbb{R}^d$ up to a small error. The hardness of this learning problem can be measured by the separation $\Delta$ defined as the minimum distance between all pairs of means. Regev and Vijayaraghavan (2017) showed that with $\Delta = \Omega(\sqrt{\log k})$ separation, the means can be learned using $\mathrm{poly}(k, d)$ samples, whereas super-polynomially many samples are required if $\Delta = o(\sqrt{\log k})$ and $d = \Omega(\log k)$. This leaves open the low-dimensional regime where $d = o(\log k)$. In this work, we give an algorithm that efficiently learns the means in $d = O(\log k/\log\log k)$ dimensions under separation $d/\sqrt{\log k}$ (modulo doubly logarithmic factors). This separation is strictly smaller than $\sqrt{\log k}$, and is also shown to be necessary. Along with the results of Regev and Vijayaraghavan (2017), our work almost pins down the critical separation threshold at which efficient parameter learning becomes possible for spherical Gaussian mixtures. More generally, our algorithm runs in time $\mathrm{poly}(k)\cdot f(d, \Delta, \epsilon)$, and is thus fixed-parameter tractable in parameters $d$, $\Delta$ and $\epsilon$. Our approach is based on estimating the Fourier transform of the mixture at carefully chosen frequencies, and both the algorithm and its analysis are simple and elementary. Our positive results can be easily extended to learning mixtures of non-Gaussian distributions, under a mild condition on the Fourier spectrum of the distribution.

Submitted 5 October, 2022; v1 submitted 5 October, 2022; originally announced October 2022.

Comments: To appear at NeurIPS 2022; v2 corrected author information

arXiv:2210.00728 [pdf, other]

doi 10.1609/aaai.v36i6.20608

Learning from the Dark: Boosting Graph Convolutional Neural Networks with Diverse Negative Samples

Authors: Wei Duan, Junyu Xuan, Maoying Qiao, Jie Lu

Abstract: Graph Convolutional Neural Networks (GCNs) have been generally accepted to be an effective tool for node representation learning. An interesting way to understand GCNs is to think of them as a message passing mechanism where each node updates its representation by accepting information from its neighbours (also known as positive samples). However, beyond these neighbouring nodes, graphs have a large, dark, all-but-forgotten world in which we find the non-neighbouring nodes (negative samples). In this paper, we show that this great dark world holds a substantial amount of information that might be useful for representation learning. Most specifically, it can provide negative information about the node representations. Our overall idea is to select appropriate negative samples for each node and incorporate the negative information contained in these samples into the representation updates. Moreover, we show that the process of selecting the negative samples is not trivial. Our theme therefore begins by describing the criteria for a good negative sample, followed by a determinantal point process algorithm for efficiently obtaining such samples. A GCN, boosted by diverse negative samples, then jointly considers the positive and negative information when passing messages. Experimental evaluations show that this idea not only improves the overall performance of standard representation learning but also significantly alleviates over-smoothing problems.

Submitted 3 October, 2022; originally announced October 2022.

arXiv:2210.00655 [pdf, other]

Online Pen Testing

Authors: Mingda Qiao, Gregory Valiant

Abstract: We study a "pen testing" problem, in which we are given $n$ pens with unknown amounts of ink $X_1, X_2, \ldots, X_n$, and we want to choose a pen with the maximum amount of remaining ink in it. The challenge is that we cannot access each $X_i$ directly; we only get to write with the $i$-th pen until either a certain amount of ink is used, or the pen runs out of ink. In both cases, this testing reduces the remaining ink in the pen and thus the utility of selecting it. Despite this significant lack of information, we show that it is possible to approximately maximize our utility up to an $O(\log n)$ factor. Formally, we consider two different setups: the "prophet" setting, in which each $X_i$ is independently drawn from some distribution $\mathcal{D}_i$, and the "secretary" setting, in which $(X_i)_{i=1}^n$ is a random permutation of arbitrary $a_1, a_2, \ldots, a_n$. We derive the optimal competitive ratios in both settings up to constant factors. Our algorithms are surprisingly robust: (1) In the prophet setting, we only require one sample from each $\mathcal{D}_i$, rather than a full description of the distribution; (2) In the secretary setting, the algorithm also succeeds under an arbitrary permutation, if an estimate of the maximum $a_i$ is given. Our techniques include a non-trivial online sampling scheme from a sequence with an unknown length, as well as the construction of a hard, non-uniform distribution over permutations. Both might be of independent interest. We also highlight some immediate open problems and discuss several directions for future research.

Submitted 21 November, 2022; v1 submitted 2 October, 2022; originally announced October 2022.

Comments: To appear at ITCS 2023; v2 added discussion on a closely related work of Awerbuch, Azar, Fiat, and Leighton (1996)

arXiv:2208.13951 [pdf, other]

doi 10.1109/JLT.2023.3235048

CD and PMD Effect on Cyclostationarity-Based Timing Recovery for Optical Coherent Receivers

Authors: Dawei Wang, Meng Qiao, Kunjian Lian, Zhaohui Li

Abstract: Timing recovery is critical for synchronizing the clocks at the transmitting and receiving ends of a digital coherent communication system. The core of timing recovery is to determine reliably the current sampling error of the local digitizer so that the timing circuit may lock to a stable operation point. Conventional timing phase detectors need to adapt to the optical fiber channel so that the common effects of this channel, such as chromatic dispersion (CD) and polarization mode dispersion (PMD), on the timing phase extraction must be understood. Here we exploit the cyclostationarity of the optical signal and derive a model for studying the CD and PMD effect. We prove that the CD-adjusted cyclic correlation matrix contains full information about timing and PMD, and the determinant of the matrix is a timing phase detector immune to both CD and PMD. We also obtain other results such as a completely PMD-independent CD estimator, etc. Our analysis is supported by both simulations and experiments over a field implemented optical cable.

Submitted 29 August, 2022; originally announced August 2022.

arXiv:2208.13146 [pdf, other]

Generative Modelling of the Ageing Heart with Cross-Sectional Imaging and Clinical Data

Authors: Mengyun Qiao, Berke Doga Basaran, Huaqi Qiu, Shuo Wang, Yi Guo, Yuanyuan Wang, Paul M. Matthews, Daniel Rueckert, Wenjia Bai

Abstract: Cardiovascular disease, the leading cause of death globally, is an age-related disease. Understanding the morphological and functional changes of the heart during ageing is a key scientific question, the answer to which will help us define important risk factors of cardiovascular disease and monitor disease progression. In this work, we propose a novel conditional generative model to describe the changes of 3D anatomy of the heart during ageing. The proposed model is flexible and allows integration of multiple clinical factors (e.g. age, gender) into the generating process. We train the model on a large-scale cross-sectional dataset of cardiac anatomies and evaluate on both cross-sectional and longitudinal datasets. The model demonstrates excellent performance in predicting the longitudinal evolution of the ageing heart and modelling its data distribution. The codes are available at https://github.com/MengyunQ/AgeHeart.

Submitted 10 October, 2022; v1 submitted 28 August, 2022; originally announced August 2022.

arXiv:2208.02135 [pdf, ps, other]

Subject-Specific Lesion Generation and Pseudo-Healthy Synthesis for Multiple Sclerosis Brain Images

Authors: Berke Doga Basaran, Mengyun Qiao, Paul M. Matthews, Wenjia Bai

Abstract: Understanding the intensity characteristics of brain lesions is key for defining image-based biomarkers in neurological studies and for predicting disease burden and outcome. In this work, we present a novel foreground-based generative method for modelling the local lesion characteristics that can both generate synthetic lesions on healthy images and synthesize subject-specific pseudo-healthy images from pathological images. Furthermore, the proposed method can be used as a data augmentation module to generate synthetic images for training brain image segmentation networks. Experiments on multiple sclerosis (MS) brain images acquired on magnetic resonance imaging (MRI) demonstrate that the proposed method can generate highly realistic pseudo-healthy and pseudo-pathological brain images. Data augmentation using the synthetic images improves the brain image segmentation performance compared to traditional data augmentation methods as well as a recent lesion-aware data augmentation technique, CarveMix. The code will be released at https://github.com/dogabasaran/lesion-synthesis.

Submitted 3 August, 2022; originally announced August 2022.

Comments: 13 pages, 6 figures, 2022 MICCAI SASHIMI (Simulation and Synthesis in Medical Imaging) Workshop paper

arXiv:2207.06115 [pdf, other]

doi 10.1038/s41567-023-01952-5

Scalable and Programmable Phononic Network with Trapped Ions

Authors: Wentao Chen, Yao Lu, Shuaining Zhang, Kuan Zhang, Guanhao Huang, Mu Qiao, Xiaolu Su, Jialiang Zhang, Jingning Zhang, Leonardo Banchi, M. S. Kim, Kihwan Kim

Abstract: Controllable bosonic systems can provide post-classical computational power with sub-universal quantum computational capability. A network that consists of a number of bosons evolving through beam-splitters and phase-shifters between different modes, has been proposed and applied to demonstrate quantum advantages. While the network has been implemented mostly in optical systems with photons, recently alternative realizations have been explored, where major limitations in photonic systems such as photon loss, and probabilistic manipulation can be addressed. Phonons, the quantized excitations of vibrational modes, of trapped ions can be a promising candidate to realize the bosonic network. Here, we experimentally demonstrate a minimal-loss phononic network that can be programmed and in which any phononic states are deterministically prepared and detected. We realize the network with up to four collective-vibrational modes, which can be straightforwardly extended to reveal quantum advantage. We benchmark the performance of the network with an exemplary algorithm of tomography for arbitrary multi-mode states with a fixed total phonon number. We obtain reconstruction fidelities of 94.5 $\pm$ 1.95 % and 93.4 $\pm$ 3.15 % for single-phonon and two-phonon states, respectively. Our experiment demonstrates a clear and novel pathway to scale up a phononic network for various quantum information processing beyond the limitations of classical and other quantum systems.

Submitted 13 July, 2022; originally announced July 2022.

Journal ref: Nature Physics, 2023: 1-7

arXiv:2207.06106 [pdf, other]

Demonstration of multi-time quantum statistics without measurement back-action

Authors: Pengfei Wang, Hyukjoon Kwon, Chun-Yang Luan, Wentao Chen, Mu Qiao, Zinan Zhou, Kaizhao Wang, M. S. Kim, Kihwan Kim

Abstract: It is challenging to obtain quantum statistics of multiple time points due to the principle of quantum mechanics that a measurement disturbs the quantum state. We propose an ancilla-assisted measurement scheme that does not suffer from the measurement-induced back-action and experimentally demonstrate it using dual-species trapped ions. By ensemble averaging the ancilla-measurement outcomes with properly chosen weights, quantum statistics, such as quantum correlation functions and quasi-probability distributions can be reconstructed. We employ $^{171}\rm{Yb}^+$-$^{138}\rm{Ba}^+$ ions as the system and the ancilla to perform multi-time measurements that consist of repeated initialization and detection of the ancilla state without affecting the system state. The two- and three-time quantum correlation functions and quasi-probability distributions are clearly revealed from experimental data. We successfully verify that the marginal distribution is unaffected by the measurement at each time and identify the nonclassicality of the reconstructed distribution. Our scheme can be applied for any $N$-time measurements of a general quantum process, which will be an essential tool for exploring properties of various quantum systems.

Submitted 13 July, 2022; originally announced July 2022.

Comments: 9 pages, 6 figures

MSC Class: 81P40

arXiv:2207.04909 [pdf, ps, other]

doi 10.1103/PhysRevA.105.063724

Tunable quantum interference effects in Floquet two- and three-level systems

Authors: Yingying Han, Minchen Qiao, Xiao-Qing Luo, Tie-Fu Li, Wenxian Zhang, Xiu-Hao Deng, J. Q. You, Dapeng Yu

Abstract: Quantum interference effects in the unmodulated quantum systems with light-matter interaction have been widely studied, such as electromagnetically induced transparency (EIT) and Autler-Townes splitting (ATS). However, the similar quantum interference effects in the Floquet systems (i.e., periodically modulated systems), which might cover rich new physics, were rarely studied. In this article, we investigate the quantum interference effects in the Floquet two- and three-level systems analytically and numerically. We show a coherent destruction tunneling effect in a lotuslike multipeak spectrum with a Floquet two-level system, where the intensity of the probe field is periodically modulated with a square-wave sequence. We demonstrate that the multipeak split into multiple transparency windows with tunable quantum interference if the Floquet system is asynchronously controlled via a third level. Based on phenomenological analysis with Akaike information criterion, we show that the symmetric central transparency window has a similar mechanism to the traditional ATS or EIT depending on the choice of parameters, additional with an extra degree of freedom to control the quantum interference provided by the modulation period. The other transparent windows are shown to be asymmetric, different from the traditional ATS and EIT windows. These nontrivial quantum interference effects open up a scope to explore the applications of the Floquet systems.

Submitted 11 July, 2022; originally announced July 2022.

arXiv:2206.14431 [pdf, other]

Open Problem: Properly learning decision trees in polynomial time?

Authors: Guy Blanc, Jane Lange, Mingda Qiao, Li-Yang Tan

Abstract: The authors recently gave an $n^{O(\log\log n)}$ time membership query algorithm for properly learning decision trees under the uniform distribution (Blanc et al., 2021). The previous fastest algorithm for this problem ran in $n^{O(\log n)}$ time, a consequence of Ehrenfeucht and Haussler (1989)'s classic algorithm for the distribution-free setting. In this article we highlight the natural open problem of obtaining a polynomial-time algorithm, discuss possible avenues towards obtaining it, and state intermediate milestones that we believe are of independent interest.

Submitted 29 June, 2022; originally announced June 2022.

Comments: 5 pages, to appear at the Open Problem sessions at COLT 2022

arXiv:2206.00311 [pdf, other]

MaskOCR: Text Recognition with Masked Encoder-Decoder Pretraining

Authors: Pengyuan Lyu, Chengquan Zhang, Shanshan Liu, Meina Qiao, Yangliu Xu, Liang Wu, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang

Abstract: Text images contain both visual and linguistic information. However, existing pre-training techniques for text recognition mainly focus on either visual representation learning or linguistic knowledge learning. In this paper, we propose a novel approach MaskOCR to unify vision and language pre-training in the classical encoder-decoder recognition framework. We adopt the masked image modeling approach to pre-train the feature encoder using a large set of unlabeled real text images, which allows us to learn strong visual representations. In contrast to introducing linguistic knowledge with an additional language model, we directly pre-train the sequence decoder. Specifically, we transform text data into synthesized text images to unify the data modalities of vision and language, and enhance the language modeling capability of the sequence decoder using a proposed masked image-language modeling scheme. Significantly, the encoder is frozen during the pre-training phase of the sequence decoder. Experimental results demonstrate that our proposed method achieves superior performance on benchmark datasets, including Chinese and English text images.

Submitted 9 October, 2023; v1 submitted 1 June, 2022; originally announced June 2022.

arXiv:2205.13310 [pdf, other]

doi 10.3847/1538-4357/ac7392

The Physical Properties of Star-Forming Galaxies with Strong [O III] Lines at z=3.25

Authors: Run Wen, FangXia An, Xian Zhong Zheng, Dong Dong Shi, Jianbo Qin, Valentino Gonzalez, Fuyan Bian, Haiguang Xu, Zhizheng Pan, Qing-Hua Tan, Wenhao Liu, Min Fang, Jian Ren, Yu Heng Zhang, Man Qiao, Shuang Liu

Abstract: We present an analysis of physical properties of 34 [O III] emission-line galaxies (ELGs) at z=3.254$\pm$0.029 in the Extended Chandra Deep Field South (ECDFS). These ELGs are selected from deep narrow H2S(1) and broad Ks imaging of 383 arcmin$^{2}$ obtained with CFHT/WIRCam. We construct spectral energy distributions (SEDs) from U to Ks to derive the physical properties of ELGs. These [O III] ELGs are identified as starburst galaxies with strong [O III] lines of L([O III]) ~ 10$^{42.6}$ - 10$^{44.2}$ erg s$^{-1}$, and have stellar masses of M* ~ 10$^{9.0}$-10$^{10.6}$ M$_\odot$ and star formation rates of ~ 10-210 M$_\odot$ yr$^{-1}$. Our results show that 24% of our sample galaxies are dusty with Av > 1 mag and EW(OIII)$_{rest}$ ~ 70-500 $Å$, which are often missed in optically selected [O III] ELG samples. Their rest-frame UV and optical morphologies from HST/ACS and HST/WFC3 deep imaging reveal that these [O III] ELGs are mostly multiple-component systems (likely mergers) or compact. And 20% of them are nearly invisible in the rest-frame UV owing to heavy dust attenuation. Interestingly, we find that our samples reside in an overdensity consisting of two components: one southeast (SE) with an overdensity factor of $δ_{gal}$ ~ 41 over a volume of 13$^{3}$ cMpc$^{3}$ and the other northwest (NW) with $δ_{gal}$ ~ 38 over a volume of 10$^{3}$ cMpc$^{3}$. The two overdense substructures are expected to be virialized at z=0 with a total mass of ~ 1.1 x 10$^{15}$ M$_\odot$ and ~ 4.8 x 10$^{14}$ M$_\odot$, and probably merge into a Coma-like galaxy cluster.

Submitted 22 July, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

Comments: 22 pages, 11 figures, 3 tables. Accepted for publication in ApJ

Journal ref: Astrophysics Journal, year:2022, month:july, volume:933, pages:50

arXiv:2204.09924 [pdf, other]

Progressive Training of A Two-Stage Framework for Video Restoration

Authors: Meisong Zheng, Qunliang Xing, Minglang Qiao, Mai Xu, Lai Jiang, Huaida Liu, Ying Chen

Abstract: As a widely studied task, video restoration aims to enhance the quality of the videos with multiple potential degradations, such as noises, blurs and compression artifacts. Among video restorations, compressed video quality enhancement and video super-resolution are two of the main tasks with significant values in practical scenarios. Recently, recurrent neural networks and transformers attract increasing research interests in this field, due to their impressive capability in sequence-to-sequence modeling. However, the training of these models is not only costly but also relatively hard to converge, with gradient exploding and vanishing problems. To cope with these problems, we proposed a two-stage framework including a multi-frame recurrent network and a single-frame transformer. Besides, multiple training strategies, such as transfer learning and progressive training, are developed to shorten the training time and improve the model performance. Benefiting from the above technical contributions, our solution wins two champions and a runner-up in the NTIRE 2022 super-resolution and quality enhancement of compressed video challenges. Code is available at https://github.com/ryanxingql/winner-ntire22-vqe.

Submitted 4 February, 2023; v1 submitted 21 April, 2022; originally announced April 2022.

Comments: Winning two championships and one runner-up in the NTIRE 2022 challenge on super-resolution and quality enhancement of compressed video; Accepted to CVPRW 2022

arXiv:2204.09314 [pdf, other]

NTIRE 2022 Challenge on Super-Resolution and Quality Enhancement of Compressed Video: Dataset, Methods and Results

Authors: Ren Yang, Radu Timofte, Meisong Zheng, Qunliang Xing, Minglang Qiao, Mai Xu, Lai Jiang, Huaida Liu, Ying Chen, Youcheng Ben, Xiao Zhou, Chen Fu, Pei Cheng, Gang Yu, Junyi Li, Renlong Wu, Zhilu Zhang, Wei Shang, Zhengyao Lv, Yunjin Chen, Mingcai Zhou, Dongwei Ren, Kai Zhang, Wangmeng Zuo, Pavel Ostyakov , et al. (54 additional authors not shown)

Abstract: This paper reviews the NTIRE 2022 Challenge on Super-Resolution and Quality Enhancement of Compressed Video. In this challenge, we proposed the LDV 2.0 dataset, which includes the LDV dataset (240 videos) and 95 additional videos. This challenge includes three tracks. Track 1 aims at enhancing the videos compressed by HEVC at a fixed QP. Track 2 and Track 3 target both the super-resolution and quality enhancement of HEVC compressed video. They require x2 and x4 super-resolution, respectively. The three tracks totally attract more than 600 registrations. In the test phase, 8 teams, 8 teams and 12 teams submitted the final results to Tracks 1, 2 and 3, respectively. The proposed methods and solutions gauge the state-of-the-art of super-resolution and quality enhancement of compressed video. The proposed LDV 2.0 dataset is available at https://github.com/RenYang-home/LDV_dataset. The homepage of this challenge (including open-sourced codes) is at https://github.com/RenYang-home/NTIRE22_VEnh_SR.

Submitted 25 April, 2022; v1 submitted 20 April, 2022; originally announced April 2022.

arXiv:2204.07283 [pdf, other]

Observing frustrated quantum magnetism in two-dimensional ion crystals

Authors: Mu Qiao, Zhengyang Cai, Ye Wang, Botao Du, Naijun Jin, Wentao Chen, Pengfei Wang, Chunyang Luan, Erfu Gao, Ximo Sun, Haonan Tian, Jingning Zhang, Kihwan Kim

Abstract: Two-dimensional (2D) quantum magnetism is a paradigm in strongly correlated many-body physics. The understanding of 2D quantum magnetism can be expedited by employing a controllable quantum simulator that faithfully maps 2D-spin Hamiltonians. The 2D quantum simulators can exhibit exotic phenomena such as frustrated quantum magnetism and topological order and can be used to show quantum computational advantages. Many experimental platforms are being developed, including Rydberg atoms and superconducting annealers. However, with trapped-ion systems, which showed the most advanced controllability and quantum coherence, quantum magnetism was explored in one-dimensional chains. Here, we report simulations of frustrated quantum magnetism with 2D ion crystals. We create a variety of spin-spin interactions for quantum magnets, including those that exhibit frustration by driving different vibrational modes and adiabatically prepare the corresponding ground states. The experimentally measured ground states are consistent with the theoretical predictions and are highly degenerate for geometrically frustrated spin models in two dimensions. Quantum coherence of the ground states is probed by reversing the time evolution of the B-field to the initial value and then measuring the extent to which the remaining state coincides with the initial state. Our results open the door for quantum simulations with 2D ion crystals.

Submitted 14 April, 2022; originally announced April 2022.

Comments: 11 pages, 9 figures

arXiv:2204.01489 [pdf, other]

Towards a New Science of Disinformation

Authors: Claudio S. Pinhanez, German H. Flores, Marisa A. Vasconcelos, Mu Qiao, Nick Linck, Rogério de Paula, Yuya J. Ong

Abstract: How can we best address the dangerous impact that deep learning-generated fake audios, photographs, and videos (a.k.a. deepfakes) may have in personal and societal life? We foresee that the availability of cheap deepfake technology will create a second wave of disinformation where people will receive specific, personalized disinformation through different channels, making the current approaches to fight disinformation obsolete. We argue that fake media has to be seen as an upcoming cybersecurity problem, and we have to shift from combating its spread to a prevention and cure framework where users have available ways to verify, challenge, and argue against the veracity of each piece of media they are exposed to. To create the technologies behind this framework, we propose that a new Science of Disinformation is needed, one which creates a theoretical framework both for the processes of communication and consumption of false content. Key scientific and technological challenges facing this research agenda are listed and discussed in the light of state-of-art technologies for fake media generation and detection, argument finding and construction, and how to effectively engage users in the prevention and cure processes.

Submitted 17 March, 2022; originally announced April 2022.

arXiv:2203.09260 [pdf, other]

doi 10.1093/mnras/stac824

Submillimetre galaxies in two massive protoclusters at z = 2.24: witnessing the enrichment of extreme starbursts in the outskirts of HAE density peaks

Authors: Yuheng Zhang, Xian Zhong Zheng, Dong Dong Shi, Yu Gao, Helmut Dannerbauer, Fang Xia An, Xinwen Shu, Zhen-Kai Gao, Wei-Hao Wang, Xin Wang, Zheng Cai, Xiaohui Fan, Min Fang, Zhizheng Pan, Wenhao Liu, Qinghua Tan, Jianbo Qin, Jian Ren, Man Qiao, Run Wen, Shuang Liu

Abstract: Submillimetre galaxies represent a rapid growth phase of both star formation and massive galaxies. Mapping SMGs in galaxy protoclusters provides key insights into where and how these extreme starbursts take place in connections with the assembly of the large-scale structure in the early Universe. We search for SMGs at 850$\,μm$ using JCMT/SCUBA-2 in two massive protoclusters at $z=2.24$, BOSS1244 and BOSS1542, and detect 43 and 54 sources with $S_{850}>4\,$mJy at the $4σ$ level within an effective area of 264$\,$arcmin$^2$, respectively. We construct the intrinsic number counts and find that the abundance of SMGs is $2.0\pm0.3$ and $2.1\pm0.2$ times that of the general fields, confirming that BOSS1244 and BOSS1542 contain a higher fraction of dusty galaxies with strongly enhanced star formation. The volume densities of the SMGs are estimated to be $\sim15-$30 times the average, significantly higher than the overdensity factor ($\sim 6$) traced by H$α$ emission-line galaxies (HAEs). More importantly, we discover a prominent offset between the spatial distributions of the two populations in these two protoclusters -- SMGs are mostly located around the high-density regions of HAEs, and few are seen inside these regions. This finding may have revealed for the first time the occurrence of violent star formation enhancement in the outskirts of the HAE density peaks, likely driven by the boosting of gas supplies and/or starburst triggering events. Meanwhile, the lack of SMGs inside the most overdense regions at $z\sim2$ implies a transition to the environment disfavouring extreme starbursts.

Submitted 21 March, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

Comments: 16pages, 11figures, accepted for publication in MNRAS

arXiv:2201.05467 [pdf, ps, other]

doi 10.1093/mnras/stac132

Systematic biases in determining dust attenuation curves through galaxy SED fitting

Authors: Jianbo Qin, Xian Zhong Zheng, Min Fang, Zhizheng Pan, Stijn Wuyts, Yong Shi, Yingjie Peng, Valentino Gonzalez, Fuyan Bian, Jia-Sheng Huang, Qiu-Sheng Gu, Wenhao Liu, Qinghua Tan, Dong Dong Shi, Jian Ren, Yuheng Zhang, Man Qiao, Run Wen, Shuang Liu

Abstract: While the slope of the dust attenuation curve ($δ$) is found to correlate with effective dust attenuation ($A_V$) as obtained through spectral energy distribution (SED) fitting, it remains unknown how the fitting degeneracies shape this relation. We examine the degeneracy effects by fitting SEDs of a sample of local star-forming galaxies (SFGs) selected from the Galaxy And Mass Assembly survey, in conjunction with mock galaxy SEDs of known attenuation parameters. A well-designed declining starburst star formation history is adopted to generate model SED templates with intrinsic UV slope ($β_0$) spanning over a reasonably wide range. The best-fitting $β_0$ for our sample SFGs shows a wide coverage, dramatically differing from the limited range of $β_0<-2.2$ for a starburst of constant star formation. Our results show that strong degeneracies between $β_0$, $δ$, and $A_V$ in the SED fitting induce systematic biases leading to a false $A_V$--$δ$ correlation. Our simulation tests reveal that this relationship can be well reproduced even when a flat $A_V$--$δ$ relation is taken to build the input model galaxy SEDs. The variations in best-fitting $δ$ are dominated by the fitting errors. We show that assuming a starburst with constant star formation in SED fitting will result in a steeper attenuation curve, smaller degeneracy errors, and a stronger $A_V$--$δ$ relation. Our findings confirm that the $A_V$--$δ$ relation obtained through SED fitting is likely driven by the systematic biases induced by the fitting degeneracies between $β_0$, $δ$, and $A_V$.

Submitted 14 January, 2022; originally announced January 2022.

Comments: 21 pages, 13 figures, accepted for publication in the MNRAS, Comments welcome!

arXiv:2112.13612 [pdf, other]

doi 10.1126/sciadv.abk1660

Significant-loophole-free test of Kochen-Specker contextuality using two species of atomic-ions

Authors: Pengfei Wang, Junhua Zhang, Chun-Yang Luan, Mark Um, Ye Wang, Mu Qiao, Tian Xie, Jing-Ning Zhang, Adán Cabello, Kihwan Kim

Abstract: Quantum measurements cannot be thought of as revealing preexisting results, even when they do not disturb any other measurement in the same trial. This feature is called contextuality and is crucial for the quantum advantage in computing. Here, we report the first observation of quantum contextuality simultaneously free of the detection, sharpness and compatibility loopholes. The detection and sharpness loopholes are closed by adopting a hybrid two-ion system and highly efficient fluorescence measurements offering a detection efficiency of $100\%$ and a measurement repeatability $>98\%$. The compatibility loophole is closed by targeting correlations between observables for two different ions in a Paul trap, a $^{171}\mathrm{Yb}^{+}$ ion and a $^{138}\mathrm{Ba}^{+}$ ion, chosen so measurements on each ion use different operation laser wavelengths, fluorescence wavelengths, and detectors. The experimental results show a violation of the bound for the most adversarial noncontextual models and open a new way to certify quantum systems.

Submitted 27 December, 2021; originally announced December 2021.

Comments: 8 pages, 6 figures, 1 table, 65 references

MSC Class: 81P13(Primary) 78A37(Secondary)

Journal ref: Science Advances 8, eabk1660 (2022)

arXiv:2112.06466 [pdf, other]

doi 10.1093/mnras/stab3633

The cosmic environment overtakes the local density in shaping galaxy star formation

Authors: Jian Ren, Zhizheng Pan, XianZhong Zheng, Jianbo Qin, DongDong Shi, Valentino Gonzalez, Fuyan Bian, Jia-Sheng Huang, Min Fang, Wenhao Liu, Run Wen, Yuheng Zhang, Man Qiao, Shuang Liu

Abstract: The gas supply from the cosmic web is the key to sustain star formation in galaxies. It remains to be explored how the cosmic large-scale structure (LSS) affects galaxy evolution at given local environments. We examine galaxy specific star formation rate as a function of local density in a LSS at $z=0.735$ in the Extended Chandra Deep Field South. The LSS is mapped by 732 galaxies with $R<24$\,mag and redshift at $0.72\le z \le 0.75$ collected from the literature and our spectroscopic observations with Magellan/IMACS, consisting of five galaxy clusters/groups and surrounding filaments over an area of $23.9 \times22.7$\,co-moving\,Mpc$^2$. The spread of spectroscopic redshifts corresponds to a velocity dispersion of 494\,km\,s$^{-1}$, indicating the LSS likely to be a thin sheet with a galaxy density $\gtrsim 3.9$ times that of the general field. These clusters/groups in this LSS mostly exhibit elongated morphologies and multiple components connected with surrounding filaments. Strikingly, we find that star-forming galaxies in the LSS keep star formation at the same level as field, and show no dependence on local density but stellar mass. Meanwhile, an increasing fraction of quiescent galaxies is detected at increasing local density in both the LSS and the field, consistent with the expectation that galaxy mass and local dense environment hold the key to quench star formation. Combined together, we conclude that the cosmic environment of the LSS overtakes the local environment in remaining galaxy star formation to the level of the field.

Submitted 13 December, 2021; originally announced December 2021.

Comments: 14 pages, 13 figures, Accepted for publication in MNRAS

arXiv:2111.08567 [pdf, other]

Joint Learning of Visual-Audio Saliency Prediction and Sound Source Localization on Multi-face Videos

Authors: Minglang Qiao, Yufan Liu, Mai Xu, Xin Deng, Bing Li, Weiming Hu, Ali Borji

Abstract: Visual and audio events occur simultaneously and both attract attention. However, most existing saliency prediction works ignore the influence of audio and only consider the visual modality. In this paper, we propose a multitask learning method for visual-audio saliency prediction and sound source localization on multi-face video by leveraging visual, audio and face information. Specifically, we first introduce a large-scale database of multi-face video in visual-audio condition (MVVA), containing eye-tracking data and sound source annotations. Using this database, we find that sound influences human attention, and conversely attention offers a cue to determine sound source on multi-face video. Guided by these findings, a visual-audio multi-task network (VAM-Net) is introduced to predict saliency and locate sound source. VAM-Net consists of three branches corresponding to visual, audio and face modalities. The visual branch has a two-stream architecture to capture spatial and temporal information. The audio and face branches encode audio signals and faces, respectively. Finally, a spatio-temporal multi-modal graph (STMG) is constructed to model the interaction among multiple faces. With joint optimization of these branches, the intrinsic correlation of the tasks of saliency prediction and sound source localization is utilized and their performance is boosted by each other. Experiments show that the proposed method outperforms 12 state-of-the-art saliency prediction methods, and achieves competitive results in sound source localization.

Submitted 5 November, 2021; originally announced November 2021.

Comments: 21 pages, 15 figures

arXiv:2109.05287 [pdf, other]

Dual-view Snapshot Compressive Imaging via Optical Flow Aided Recurrent Neural Network

Authors: Ruiying Lu, Bo Chen, Guanliang Liu, Ziheng Cheng, Mu Qiao, Xin Yuan

Abstract: Dual-view snapshot compressive imaging (SCI) aims to capture videos from two fields of view (FoVs) using a 2D sensor (detector) in a single snapshot, achieving joint FoV and temporal compressive sensing, and thus enjoying the advantages of low bandwidth, low power, and low cost. However, it is challenging for existing model-based decoding algorithms to reconstruct each individual scene, and they usually require exhaustive parameter tuning with extremely long running time for large-scale data. In this paper, we propose an optical flow-aided recurrent neural network for dual video SCI systems, which provides high-quality decoding in seconds. Firstly, we develop a diversity amplification method to enlarge the differences between scenes of two FoVs, and design a deep convolutional neural network with dual branches to separate different scenes from the single measurement. Secondly, we integrate the bidirectional optical flow extracted from adjacent frames with the recurrent neural network to jointly reconstruct each video in a sequential manner. Extensive results on both simulation and real data demonstrate the superior performance of our proposed model in a short inference time. The code and data are available at https://github.com/RuiyingLu/OFaNet-for-Dual-view-SCI.

Submitted 11 September, 2021; originally announced September 2021.

arXiv:2109.00637 [pdf, ps, other]

Properly learning decision trees in almost polynomial time

Authors: Guy Blanc, Jane Lange, Mingda Qiao, Li-Yang Tan

Abstract: We give an $n^{O(\log\log n)}$-time membership query algorithm for properly and agnostically learning decision trees under the uniform distribution over $\{\pm 1\}^n$. Even in the realizable setting, the previous fastest runtime was $n^{O(\log n)}$, a consequence of a classic algorithm of Ehrenfeucht and Haussler. Our algorithm shares similarities with practical heuristics for learning decision trees, which we augment with additional ideas to circumvent known lower bounds against these heuristics. To analyze our algorithm, we prove a new structural result for decision trees that strengthens a theorem of O'Donnell, Saks, Schramm, and Servedio. While the OSSS theorem says that every decision tree has an influential variable, we show how every decision tree can be "pruned" so that every variable in the resulting tree is influential.

Submitted 1 November, 2021; v1 submitted 1 September, 2021; originally announced September 2021.

Comments: 21 pages, to appear in FOCS 2021

arXiv:2107.00819 [pdf, other]

Decision tree heuristics can fail, even in the smoothed setting

Authors: Guy Blanc, Jane Lange, Mingda Qiao, Li-Yang Tan

Abstract: Greedy decision tree learning heuristics are mainstays of machine learning practice, but theoretical justification for their empirical success remains elusive. In fact, it has long been known that there are simple target functions for which they fail badly (Kearns and Mansour, STOC 1996). Recent work of Brutzkus, Daniely, and Malach (COLT 2020) considered the smoothed analysis model as a possible avenue towards resolving this disconnect. Within the smoothed setting and for targets $f$ that are $k$-juntas, they showed that these heuristics successfully learn $f$ with depth-$k$ decision tree hypotheses. They conjectured that the same guarantee holds more generally for targets that are depth-$k$ decision trees. We provide a counterexample to this conjecture: we construct targets that are depth-$k$ decision trees and show that even in the smoothed setting, these heuristics build trees of depth $2^{\Omega(k)}$ before achieving high accuracy. We also show that the guarantees of Brutzkus et al. cannot extend to the agnostic setting: there are targets that are very close to $k$-juntas, for which these heuristics build trees of depth $2^{\Omega(k)}$ before achieving high accuracy.

Submitted 2 July, 2021; originally announced July 2021.

Comments: To appear in RANDOM 2021

Showing 1–50 of 87 results for author: Qiao, M