Search | arXiv e-print repository

From Biased Selective Labels to Pseudo-Labels: An Expectation-Maximization Framework for Learning from Biased Decisions

Abstract: Selective labels occur when label observations are subject to a decision-making process; e.g., diagnoses that depend on the administration of laboratory tests. We study a clinically-inspired selective label problem called disparate censorship, where labeling biases vary across subgroups and unlabeled individuals are imputed as "negative" (i.e., no diagnostic test = no illness). Machine learning models naively trained on such labels could amplify labeling bias. Inspired by causal models of selective labels, we propose Disparate Censorship Expectation-Maximization (DCEM), an algorithm for learning in the presence of disparate censorship. We theoretically analyze how DCEM mitigates the effects of disparate censorship on model performance. We validate DCEM on synthetic data, showing that it improves bias mitigation (area between ROC curves) without sacrificing discriminative performance (AUC) compared to baselines. We achieve similar results in a sepsis classification task using clinical data.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 39 pages, 33 figures. ICML 2024 conference paper

arXiv:2406.16771 [pdf, other]

An antiferromagnetic diode effect in even-layered MnBi2Te4

Authors: Anyuan Gao, Shao-Wen Chen, Barun Ghosh, Jian-Xiang Qiu, Yu-Fei Liu, Yugo Onishi, Chaowei Hu, Tiema Qian, Damien Bérubé, Thao Dinh, Houchen Li, Christian Tzschaschel, Seunghyun Park, Tianye Huang, Shang-Wei Lien, Zhe Sun, Sheng-Chin Ho, Bahadur Singh, Kenji Watanabe, Takashi Taniguchi, David C. Bell, Arun Bansil, Hsin Lin, Tay-Rong Chang, Amir Yacoby, et al. (4 additional authors not shown)

Abstract: In a PN junction, the separation between positive and negative charges leads to diode transport. In the past few years, the intrinsic diode transport in noncentrosymmetric polar conductors has attracted great interest, because it suggests novel nonlinear applications and provides a symmetry-sensitive probe of the Fermi surface. Recently, such studies have been extended to noncentrosymmetric superconductors, realizing the superconducting diode effect. Here, we show that, even in a centrosymmetric crystal without directional charge separation, the spins of an antiferromagnet (AFM) can generate a spatial directionality, leading to an AFM diode effect. We observe large second-harmonic transport in a nonlinear electronic device enabled by the compensated AFM state of even-layered MnBi2Te4. We also report a novel electrical sum-frequency generation (SFG), which has been rarely explored in contrast to the well-known optical SFG in wide-gap insulators. We demonstrate that the AFM enables an in-plane field-effect transistor and harvesting of wireless electromagnetic energy. The electrical SFG establishes a powerful method to study nonlinear electronics built by quantum materials. The AFM diode effect paves the way for potential device concepts including AFM logic circuits, self-powered AFM spintronics, and other applications that potentially bridge nonlinear electronics with AFM spintronics.

Submitted 24 June, 2024; originally announced June 2024.

Comments: 33+8 pages, 14+2 figures

arXiv:2406.13131 [pdf, other]

When Parts are Greater Than Sums: Individual LLM Components Can Outperform Full Models

Authors: Ting-Yun Chang, Jesse Thomason, Robin Jia

Abstract: This paper studies in-context learning (ICL) by decomposing the output of large language models into the individual contributions of attention heads and MLPs (components). We observe curious components: good-performing ones that individually do well on a classification task, even when the model performs poorly; bad-performing ones that do much worse than chance; and label-biased components that always predict the same label. We find that component accuracies are well-correlated across different demonstration sets and perturbations of prompt templates, even when the full-model accuracy varies greatly. Based on our findings, we propose component reweighting, which learns to linearly re-scale the component activations from a few labeled examples. Given 24 labeled examples, our method improves by an average of 6.0% accuracy points over 24-shot ICL across 8 tasks on Llama-2-7B. Overall, this paper both enriches our understanding of ICL and provides a practical method for improvement by examining model internals.

Submitted 24 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

Comments: fix typos and citations; appendix

arXiv:2406.09923 [pdf, other]

CliBench: Multifaceted Evaluation of Large Language Models in Clinical Decisions on Diagnoses, Procedures, Lab Tests Orders and Prescriptions

Authors: Mingyu Derek Ma, Chenchen Ye, Yu Yan, Xiaoxuan Wang, Peipei Ping, Timothy S Chang, Wei Wang

Abstract: The integration of Artificial Intelligence (AI), especially Large Language Models (LLMs), into the clinical diagnosis process offers significant potential to improve the efficiency and accessibility of medical care. While LLMs have shown some promise in the medical domain, their application in clinical diagnosis remains underexplored, especially in real-world clinical practice, where highly sophisticated, patient-specific decisions need to be made. Current evaluations of LLMs in this field are often narrow in scope, focusing on specific diseases or specialties and employing simplified diagnostic tasks. To bridge this gap, we introduce CliBench, a novel benchmark developed from the MIMIC IV dataset, offering a comprehensive and realistic assessment of LLMs' capabilities in clinical diagnosis. This benchmark not only covers diagnoses from a diverse range of medical cases across various specialties but also incorporates tasks of clinical significance: treatment procedure identification, lab test ordering, and medication prescriptions. Supported by structured output ontologies, CliBench enables a precise and multi-granular evaluation, offering an in-depth understanding of LLMs' capabilities on diverse clinical tasks of desired granularity. We conduct a zero-shot evaluation of leading LLMs to assess their proficiency in clinical decision-making. Our preliminary results shed light on the potential and limitations of current LLMs in clinical settings, providing valuable insights for future advancements in LLM-powered healthcare.

Submitted 14 June, 2024; originally announced June 2024.

Comments: Project page: https://clibench.github.io

arXiv:2406.08701 [pdf]

Impacts of Backside Insulation on the Dynamic On-Resistance of Lateral p-GaN HEMTs-on-Si

Authors: Yu-Xuan Wang, Mao-Chou Tai, Ting-Chang Chang, Wei-Chen Huang, Zeyu Wan, Simon Li, Simon Sze, Guangrui Xia

Abstract: We examined the effect of backside insulation on the dynamic on-resistance of lateral p-GaN HEMTs. To gain a comprehensive understanding of the dynamic on-resistance difference between substrate-grounded and substrate-floating p-GaN HEMTs, we conducted in-circuit double pulse testing and long-term direct current (DC) bias stress. We found that while backside insulation can enhance the breakdown voltage of lateral p-GaN HEMTs, it also comes with a tradeoff in device reliability. Results from Sentaurus TCAD simulation suggest that the use of backside insulation in devices gradually disperses potential to the buffer barrier. As a result, the potential barrier at the buffer edge of the 2DEG channel decreases significantly, leading to considerable electron trapping at buffer traps. This breakdown voltage and reliability tradeoff also applies to HEMT technologies using insulating substrates.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2405.18881 [pdf, other]

Tuning-Free Alignment of Diffusion Models with Direct Noise Optimization

Authors: Zhiwei Tang, Jiangweizhi Peng, Jiasheng Tang, Mingyi Hong, Fan Wang, Tsung-Hui Chang

Abstract: In this work, we focus on the alignment problem of diffusion models with a continuous reward function, which represents specific objectives for downstream tasks, such as improving human preference. The central goal of the alignment problem is to adjust the distribution learned by diffusion models such that the generated samples maximize the target reward function. We propose a novel alignment approach, named Direct Noise Optimization (DNO), that optimizes the injected noise during the sampling process of diffusion models. By design, DNO is tuning-free and prompt-agnostic, as the alignment occurs in an online fashion during generation. We rigorously study the theoretical properties of DNO and also propose variants to deal with non-differentiable reward functions. Furthermore, we identify that naive implementation of DNO occasionally suffers from the out-of-distribution reward hacking problem, where optimized samples have high rewards but are no longer in the support of the pretrained distribution. To remedy this issue, we leverage classical high-dimensional statistics theory and propose to augment the DNO loss with certain probability regularization. We conduct extensive experiments on several popular reward functions trained on human feedback data and demonstrate that the proposed DNO approach achieves state-of-the-art reward scores as well as high image quality, all within a reasonable time budget for generation.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.15297 [pdf]

doi 10.1103/PhysRevB.109.184112

High-field magnetoelectric coupling and successive magnetic transitions in Mn-doped polar antiferromagnet Ni3TeO6

Authors: J. H. Zhang, L. Lin, C. Dong, Y. T. Chang, J. F. Wang, C. L. Lu, P. Z. Chen, W. J. Zhai, G. Z. Zhou, L. Huang, Y. S. Tang, S. H. Zheng, M. F. Liu, X. H. Zhou, Z. B. Yan, J. -M. Liu

Abstract: Among the 3d transition metal ions doped polar Ni3TeO6, Mn-doped Ni3TeO6 has stimulated great interest due to its high magnetic ordering temperature and complex magnetic phases, but the mechanism of magnetoelectric (ME) coupling is far from understood. Herein we report our systematic investigation of the chemical control of magnetism, metamagnetic transition, and ME properties of Ni3-xMnxTeO6 single crystals in high magnetic field (H) up to 52 T. We present a previously unreported weak ferromagnetic behavior appearing in the ab plane below 9.5 K in addition to the incommensurate helical and commensurate collinear antiferromagnetic states. In the low-field region, a spin-flop type metamagnetic transition without any hysteresis occurs at Hc1 for H // c, while another metamagnetic transition accompanied by a change in electric polarization is observed at Hc2 in the high-field region both for H // c and H // ab above 30 K, which can be attributed to the sudden rotation of magnetic moments at Ni2 sites. The ME measurements reveal that a first-order ME effect is observed in the low-T and low-H regions, while a second-order ME coupling term appears above 30 K in the magnetic field range of Hc1 < H < Hc2 for H // c and H < Hc2 for H // ab, both becoming significant with increasing temperature. Eventually, they are dominated by the second-order ME effect near the antiferromagnetic transition temperature. The present work demonstrates that Ni3-xMnxTeO6 is an exotic magnetoelectric material compared with Ni3TeO6 and its derivatives, thereby providing insights to better understand the magnetism and ME coupling in Ni3TeO6 and its derivatives.

Submitted 29 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

Comments: 30 pages with 8 figures

Journal ref: Phys. Rev. B 109, 184112 (2024)

arXiv:2405.13419 [pdf]

Percolation Effect Induced Significant Change of Complex Permittivity and Permeability for Silver-Epoxy Nano-Composites

Authors: Bo-Wei Tseng, Tsun-Hsu Chang

Abstract: The intricate interplay between complex permittivity and permeability constitutes the cornerstone of electromagnetic (EM) applications, enabling precise customization for various uses. This study employed silver-epoxy nano-composites to exemplify a conductor-insulator composite, leveraging silver's exceptional attributes, such as high conductivity and low reactivity. The determination of complex permittivity and permeability was conducted via the transmission/reflection method. At lower concentrations of dispersed silver particles, these nano-particles within the epoxy resin act as modest dipoles, augmenting permittivity. This regime aligns closely with the effective medium theory (EMT) and comprises the focus of much research. However, nearing the percolation threshold, a percolation effect emerges, drastically accelerating enhancement rates beyond the predictions of EMT. Simultaneously, long-wavelength electromagnetic waves induce diamagnetic currents within loops formed by metal grains. This diamagnetic effect intensifies with increasing volume fraction, leading to a reduction in permeability. This study observed percolation power law behavior near the threshold with calculated critical exponents. Consequently, the dielectric constant of the silver-epoxy nano-composite reached a maximum of 515. Regarding permeability, the lowest recorded value was 0.31. These findings were obtained within the X-band (8.2 GHz~12.4 GHz) region.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 10 figures

arXiv:2405.12235 [pdf]

Hypergraph: A Unified and Uniform Definition with Application to Chemical Hypergraph

Authors: Daniel T. Chang

Abstract: The conventional definition of hypergraph has two major issues: (1) there is not a standard definition of directed hypergraph and (2) there is not a formal definition of nested hypergraph. To resolve these issues, we propose a new definition of hypergraph that unifies the concepts of undirected, directed and nested hypergraphs, and that is uniform in using hyperedge as a single construct for representing high-order correlations among things, i.e., nodes and hyperedges. Specifically, we define a hyperedge to be a simple hyperedge, a nesting hyperedge, or a directed hyperedge. With this new definition, a hypergraph is nested if it has nesting hyperedge(s), and is directed if it has directed hyperedge(s). Otherwise, a hypergraph is a simple hypergraph. The uniformity and power of this new definition, with visualization, should facilitate the use of hypergraph for representing (hierarchical) high-order correlations in general and chemical systems in particular. Graph has been widely used as a mathematical structure for machine learning on molecular structures and 3D molecular geometries. However, graph has a major limitation: it can represent only pairwise correlations between nodes. Hypergraph extends graph with high-order correlations among nodes. This extension is significant or essential for machine learning on chemical systems. For molecules, this is significant as it allows the direct, explicit representation of multicenter bonds and molecular substructures. For chemical reactions, this is essential since most chemical reactions involve multiple participants. We propose the use of chemical hypergraph, a multilevel hypergraph with simple, nesting and directed hyperedges, as a single mathematical structure for representing chemical systems. We apply the new definition of hypergraph to chemical hypergraph and, as simplified versions, molecular hypergraph and chemical reaction hypergraph.

Submitted 18 June, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

Comments: arXiv admin note: text overlap with arXiv:2310.03623 by other authors

arXiv:2405.12144 [pdf]

Alterations of electrocortical activity during hand movements induced by motor cortex glioma

Authors: Yihan Wu, Tao Chang, Siliang Chen, Xiaodong Niu, Yu Li, Yuan Fang, Lei Yang, Yixuan Zong, Yaoxin Yang, Yuehua Li, Mengsong Wang, Wen Yang, Yixuan Wu, Chen Fu, Xia Fang, Yuxin Quan, Xilin Peng, Qiang Sun, Marc M. Van Hulle, Yanhui Liu, Ning Jiang, Dario Farina, Yuan Yang, Jiayuan He, Qing Mao

Abstract: Glioma cells can reshape functional neuronal networks by hijacking neuronal synapses, leading to partial or complete neurological dysfunction. These mechanisms have been previously explored for language functions. However, the impact of glioma on sensorimotor functions is still unknown. Therefore, we recruited a control group of patients with unaffected motor cortex and a group of patients with glioma-infiltrated motor cortex, and recorded high-density electrocortical signals during finger movement tasks. The results showed that glioma suppresses task-related synchronization in the high-gamma band and reduces the power across all frequency bands. The resulting atypical motor information transmission model with discrete signaling pathways and delayed responses disrupts the stability of neuronal encoding patterns for finger movement kinematics across various temporal-spatial scales. These findings demonstrate that gliomas functionally invade neural circuits within the motor cortex. This result advances our understanding of motor function processing in chronic disease states, which is important to advance the surgical strategies and neurorehabilitation approaches for patients with malignant gliomas.

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2405.08388 [pdf, other]

The Radio and Microwave Sky as Seen by Juno on its Mission to Jupiter

Authors: Christopher Anderson, Philippe Berger, Tzu-Ching Chang, Olivier Doré, Shannon Brown, Steve Levin, Michael Seiffert

Abstract: We present six nearly full-sky maps made from data taken by radiometers on the Juno satellite during its 5-year flight to Jupiter. The maps represent integrated emission over $\sim 4\%$ passbands spaced approximately in octaves between 600 MHz and 21.9 GHz. Long time-scale offset drifts are removed in all bands, and, for the two lowest frequency bands, gain drifts are also removed from the maps via a self-calibration algorithm similar to the NPIPE pipeline used by the Planck collaboration. We show that, after this solution is applied, residual noise in the maps is consistent with thermal radiometer noise. We verify our map solutions with several consistency tests and end-to-end simulations. We also estimate the level of pixelization noise and polarization leakage via simulations.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 15 pages, 13 figures

arXiv:2405.06763 [pdf, other]

Post-selection inference for causal effects after causal discovery

Authors: Ting-Hsuan Chang, Zijian Guo, Daniel Malinsky

Abstract: Algorithms for constraint-based causal discovery select graphical causal models among a space of possible candidates (e.g., all directed acyclic graphs) by executing a sequence of conditional independence tests. These may be used to inform the estimation of causal effects (e.g., average treatment effects) when there is uncertainty about which covariates ought to be adjusted for, or which variables act as confounders versus mediators. However, naively using the data twice, for model selection and estimation, would lead to invalid confidence intervals. Moreover, if the selected graph is incorrect, the inferential claims may apply to a selected functional that is distinct from the actual causal effect. We propose an approach to post-selection inference that is based on a resampling and screening procedure, which essentially performs causal discovery multiple times with randomly varying intermediate test statistics. Then, an estimate of the target causal effect and corresponding confidence sets are constructed from a union of individual graph-based estimates and intervals. We show that this construction has asymptotically correct coverage for the true causal effect parameter. Importantly, the guarantee holds for a fixed population-level effect, not a data-dependent or selection-dependent quantity. Most of our exposition focuses on the PC-algorithm for learning directed acyclic graphs and the multivariate Gaussian case for simplicity, but the approach is general and modular, so it may be used with other conditional independence based discovery algorithms and distributional families.

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.01610 [pdf, other]

Automating the Analysis of Public Saliency and Attitudes towards Biodiversity from Digital Media

Authors: Noah Giebink, Amrita Gupta, Diogo Verìssimo, Charlotte H. Chang, Tony Chang, Angela Brennan, Brett Dickson, Alex Bowmer, Jonathan Baillie

Abstract: Measuring public attitudes toward wildlife provides crucial insights into our relationship with nature and helps monitor progress toward Global Biodiversity Framework targets. Yet, conducting such assessments at a global scale is challenging. Manually curating search terms for querying news and social media is tedious, costly, and can lead to biased results. Raw news and social media data returned from queries are often cluttered with irrelevant content and syndicated articles. We aim to overcome these challenges by leveraging modern Natural Language Processing (NLP) tools. We introduce a folk taxonomy approach for improved search term generation and employ cosine similarity on Term Frequency-Inverse Document Frequency vectors to filter syndicated articles. We also introduce an extensible relevance filtering pipeline which uses unsupervised learning to reveal common topics, followed by an open-source zero-shot Large Language Model (LLM) to assign topics to news article titles, which are then used to assign relevance. Finally, we conduct sentiment, topic, and volume analyses on resulting data. We illustrate our methodology with a case study of news and X (formerly Twitter) data before and during the COVID-19 pandemic for various mammal taxa, including bats, pangolins, elephants, and gorillas. During the data collection period, up to 62% of articles including keywords pertaining to bats were deemed irrelevant to biodiversity, underscoring the importance of relevance filtering. At the pandemic's onset, we observed increased volume and a significant sentiment shift toward horseshoe bats, which were implicated in the pandemic, but not for other focal taxa. The proposed methods open the door to conservation practitioners applying modern and emerging NLP tools, including LLMs "out of the box," to analyze public perceptions of biodiversity during current events or campaigns.

Submitted 2 May, 2024; originally announced May 2024.

Comments: v0.1, 21 pages with 10 figures

arXiv:2405.00369 [pdf, ps, other]

Singular velocity of the Stokes and Navier-Stokes equations near boundary in the half space

Authors: TongKeun Chang, Kyungkeun Kang

Abstract: Local behaviors near boundary are analyzed for solutions of the Stokes and Navier-Stokes equations in the half space with localized non-smooth boundary data. We construct solutions of Stokes equations whose velocity field is not bounded near boundary away from the support of boundary data, although velocity and gradient velocity of solutions are locally square integrable. This is an improvement compared to known results in the sense that velocity field is unbounded itself, since previously constructed solutions were bounded near boundary, although their normal derivatives are singular. We also establish singular solutions and their derivatives that do not belong to $L^q_{\rm{loc}}$ near boundary with $q> 1$. For such examples, their corresponding pressures turn out not to be locally integrable. Similar construction via a perturbation argument is available to the Navier-Stokes equations near boundary as well.

Submitted 6 June, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

arXiv:2404.11808 [pdf, other]

Future Perspectives for Gamma-ray Burst Detection from Space

Authors: Enrico Bozzo, Lorenzo Amati, Wayne Baumgartner, Tzu-Ching Chang, Bertrand Cordier, Nicolas De Angelis, Akihiro Doi, Marco Feroci, Cynthia Froning, Jessica Gaskin, Adam Goldstein, Diego Götz, Jon E. Grove, Sylvain Guiriec, Margarita Hernanz, C. Michelle Hui, Peter Jenke, Daniel Kocevski, Merlin Kole, Chryssa Kouveliotou, Thomas Maccarone, Mark L. McConnell, Hideo Matsuhara, Paul O'Brien, Nicolas Produit, et al. (13 additional authors not shown)

Abstract: Since their first discovery in the late 1960s, Gamma-ray bursts have attracted an exponentially growing interest from the international community due to their central role in the most highly debated open questions of the modern research of astronomy, astrophysics, cosmology, and fundamental physics. These range from the intimate nuclear composition of high density material within the core of ultra-dense neutron stars, to stellar evolution via the collapse of massive stars, the production and propagation of gravitational waves, as well as the exploration of the early Universe by unveiling first stars and galaxies (assessing also their evolution and cosmic re-ionization). GRBs have stimulated in the past $\sim$50 years the development of cutting-edge technological instruments for observations of high energy celestial sources from space, leading to the launch and successful operations of many different scientific missions (several of them still in data taking mode nowadays). In this review, we provide a brief description of the GRB-dedicated missions from space being designed and developed for the future. The list of these projects, not meant to be exhaustive, shall serve as a reference to interested readers to understand what is likely to come next to lead the further development of GRB research and associated phenomenology.

Submitted 17 April, 2024; originally announced April 2024.

Comments: Accepted for publication on Universe. Invited review, contribution to the Universe Special Issue "Recent Advances in Gamma Ray Astrophysics and Future Perspectives", P. Romano eds. (https://www.mdpi.com/journal/universe/special_issues/7299902Z97)

arXiv:2404.11017 [pdf, other]

doi 10.1117/12.2567224

SPHEREx: NASA's Near-Infrared Spectrophotometric All-Sky Survey

Authors: Brendan P. Crill, Michael Werner, Rachel Akeson, Matthew Ashby, Lindsey Bleem, James J. Bock, Sean Bryan, Jill Burnham, Joyce Byunh, Tzu-Ching Chang, Yi-Kuan Chiang, Walter Cook, Asantha Cooray, Andrew Davis, Olivier Doré, C. Darren Dowell, Gregory Dubois-Felsmann, Tim Eifler, Andreas Faisst, Salman Habib, Chen Heinrich, Katrin Heitmann, Grigory Heaton, Christopher Hirata, Viktor Hristov , et al. (29 additional authors not shown)

Abstract: SPHEREx, the Spectro-Photometer for the History of the Universe, Epoch of Reionization, and ices Explorer, is a NASA MIDEX mission planned for launch in 2024. SPHEREx will carry out the first all-sky spectral survey at wavelengths between 0.75 micron and 5 micron with spectral resolving power ~40 between 0.75 and 3.8 micron and ~120 between 3.8 and 5 micron. At the end of its two-year mission, SPHEREx will provide 0.75-to-5 micron spectra of each 6.2"x6.2" pixel on the sky - 14 billion spectra in all. This paper updates an earlier description of SPHEREx presenting changes made during the mission's Preliminary Design Phase, including a discussion of instrument integration and test and a summary of the data processing, analysis, and distribution plans.

Submitted 16 April, 2024; originally announced April 2024.

Journal ref: Proceedings Volume 11443, Space Telescopes and Instrumentation 2020: Optical, Infrared, and Millimeter Wave; 114430I (2020)

arXiv:2404.07436 [pdf, other]

Measurement of $e^{+}e^{-}\to ωη^{\prime}$ cross sections at $\sqrt{s}=$ 2.000 to 3.080 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (599 additional authors not shown)

Abstract: The Born cross sections for the process $e^{+}e^{-}\to ωη^{\prime}$ are measured at 22 center-of-mass energies from 2.000 to 3.080 GeV using data collected with the BESIII detector at the BEPCII collider. A resonant structure is observed with a statistical significance of 9.6$σ$. A Breit-Wigner fit determines its mass to be $M_R=(2153\pm30\pm31)~{\rm{MeV}}/c^{2}$ and its width to be $Γ_{R}=(167\pm77\pm7)~\rm{MeV}$, where the first uncertainties are statistical and the second are systematic.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.03840 [pdf]

FarView: An In-Situ Manufactured Lunar Far Side Radio Array Concept for 21-cm Dark Ages Cosmology

Authors: Ronald S. Polidan, Jack O. Burns, Alex Ignatiev, Alex Hegedus, Jonathan Pober, Nivedita Mahesh, Tzu-Ching Chang, Gregg Hallinan, Yuhong Ning, Judd Bowman

Abstract: FarView is an early-stage concept for a large, low-frequency radio observatory, manufactured in-situ on the lunar far side using metals extracted from the lunar regolith. It consists of 100,000 dipole antennas in compact subarrays distributed over a large area but with empty space between subarrays in a core-halo structure. FarView covers a total area of ~200 km$^2$, has a dense core within the inner ~36 km$^2$, and a ~power-law falloff of antenna density out to ~14 km from the center. With this design, it is relatively easy to identify multiple viable build sites on the lunar far side. The science case for FarView emphasizes the unique capabilities to probe the unexplored Cosmic Dark Ages - identified by the 2020 Astrophysics Decadal Survey as the discovery area for cosmology. FarView will deliver power spectra and tomographic maps tracing the evolution of the Universe from before the birth of the first stars to the beginning of Cosmic Dawn, and potentially provide unique insights into dark matter, early dark energy, neutrino masses, and the physics of inflation. What makes FarView feasible and affordable in the timeframe of the 2030s is that it is manufactured in-situ, utilizing space industrial technologies. This in-situ manufacturing architecture utilizes Earth-built equipment that is transported to the lunar surface to extract metals from the regolith and will use those metals to manufacture most of the array components: dipole antennas, power lines, and silicon solar cell power systems. This approach also enables a long functional lifetime, by permitting servicing and repair of the observatory. The full 100,000 dipole FarView observatory will take 4 - 8 years to build, depending on the realized performance of the manufacturing elements and the lunar delivery scenario.

Submitted 4 April, 2024; originally announced April 2024.

Comments: 26 pages, 7 figures, 2 tables

arXiv:2404.03586 [pdf, other]

Leveraging Interpolation Models and Error Bounds for Verifiable Scientific Machine Learning

Authors: Tyler Chang, Andrew Gillette, Romit Maulik

Abstract: Effective verification and validation techniques for modern scientific machine learning workflows are challenging to devise. Statistical methods are abundant and easily deployed, but often rely on speculative assumptions about the data and methods involved. Error bounds for classical interpolation techniques can provide mathematically rigorous estimates of accuracy, but often are difficult or impractical to determine computationally. In this work, we present a best-of-both-worlds approach to verifiable scientific machine learning by demonstrating that (1) multiple standard interpolation techniques have informative error bounds that can be computed or estimated efficiently; (2) comparative performance among distinct interpolants can aid in validation goals; (3) deploying interpolation methods on latent spaces generated by deep learning techniques enables some interpretability for black-box models. We present a detailed case study of our approach for predicting lift-drag ratios from airfoil images. Code developed for this work is available in a public Github repository.

Submitted 4 April, 2024; originally announced April 2024.

arXiv:2404.00898 [pdf, other]

CAAP: Class-Dependent Automatic Data Augmentation Based On Adaptive Policies For Time Series

Authors: Tien-Yu Chang, Hao Dai, Vincent S. Tseng

Abstract: Data Augmentation is a common technique used to enhance the performance of deep learning models by expanding the training dataset. Automatic Data Augmentation (ADA) methods are getting popular because of their capacity to generate policies for various datasets. However, existing ADA methods primarily focused on overall performance improvement, neglecting the problem of class-dependent bias that leads to performance reduction in specific classes. This bias poses significant challenges when deploying models in real-world applications. Furthermore, ADA for time series remains an underexplored domain, highlighting the need for advancements in this field. In particular, applying ADA techniques to vital signals like an electrocardiogram (ECG) is a compelling example due to its potential in medical domains such as heart disease diagnostics. We propose a novel deep learning-based approach called Class-dependent Automatic Adaptive Policies (CAAP) framework to overcome the notable class-dependent bias problem while maintaining the overall improvement in time-series data augmentation. Specifically, we utilize the policy network to generate effective sample-wise policies with balanced difficulty through class and feature information extraction. Second, we design the augmentation probability regulation method to minimize class-dependent bias. Third, we introduce the information region concepts into the ADA framework to preserve essential regions in the sample. Through a series of experiments on real-world ECG datasets, we demonstrate that CAAP outperforms representative methods in achieving lower class-dependent bias combined with superior overall performance. These results highlight the reliability of CAAP as a promising ADA method for time series modeling that fits for the demands of real-world applications.

Submitted 31 March, 2024; originally announced April 2024.

arXiv:2403.19740 [pdf, other]

Bayesian Multi-line Intensity Mapping

Authors: Yun-Ting Cheng, Kailai Wang, Benjamin D. Wandelt, Tzu-Ching Chang, Olivier Dore

Abstract: Line intensity mapping (LIM) has emerged as a promising tool for probing the 3D large-scale structure (LSS) through the aggregate emission of spectral lines. The presence of interloper lines poses a crucial challenge in extracting the signal from the target line in LIM. In this work, we introduce a novel method for LIM analysis that simultaneously extracts line signals from multiple spectral lines, utilizing the covariance of native LIM data elements defined in the spectral-angular space. We leverage correlated information from different lines to perform joint inference on all lines simultaneously, employing a Bayesian analysis framework. We present the formalism, demonstrate our technique with a mock survey setup resembling the SPHEREx deep field observation, and consider four spectral lines within the SPHEREx spectral coverage in the near infrared: H$α$, [O III], H$β$, and [O II]. We demonstrate that our method can extract the power spectrum of all four lines at $\gtrsim 10σ$ level at $z<2$. For the brightest line H$α$, the $10σ$ sensitivity can be achieved out to $z\sim3$. Our technique offers a flexible framework for LIM analysis, enabling simultaneous inference of signals from multiple line emissions while accommodating diverse modeling constraints and parametrizations.

Submitted 28 March, 2024; originally announced March 2024.

Comments: 24 pages, 14 figures, submitted to ApJ

arXiv:2403.19091 [pdf, other]

Observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (600 additional authors not shown)

Abstract: By analyzing $e^+e^-$ annihilation data corresponding to an integrated luminosity of 2.93 $\rm fb^{-1}$ collected at a center-of-mass energy of 3.773 GeV with the BESIII detector, the first observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$ is reported. With a dominant hadronic contribution from $K_1(1270)$, the branching fractions are measured to be $\mathcal{B}(D^0\rightarrow {K}_1(1270)^-(\to K^0_Sπ^-π^0)e^+ν_e)=(1.69^{+0.53}_{-0.46}\pm0.15)\times10^{-4}$ and $\mathcal{B}(D^+\to \bar{K}_1(1270)^0(\to K^0_Sπ^+π^-)e^+ν_e)=(1.47^{+0.45}_{-0.40}\pm0.20)\times10^{-4}$ with statistical significance of 5.4$σ$ and 5.6$σ$, respectively. When combined with measurements of the $K_1(1270)\to K^+π^-π$ decays, the absolute branching fractions are determined to be $\mathcal{B}(D^0\to K_1(1270)^-e^+ν_e)=(1.05^{+0.33}_{-0.28}\pm0.12\pm0.12)\times10^{-3}$ and $\mathcal{B}(D^+\to \bar{K}_1(1270)^0e^+ν_e)=(1.29^{+0.40}_{-0.35}\pm0.18\pm0.15)\times10^{-3}$. The first and second uncertainties are statistical and systematic, respectively, and the third uncertainties originate from the assumed branching fractions of the $K_1(1270)\to Kππ$ decays.

Submitted 27 March, 2024; originally announced March 2024.

Comments: 19 pages

arXiv:2403.17891 [pdf, other]

Image-based Novel Fault Detection with Deep Learning Classifiers using Hierarchical Labels

Authors: Nurettin Sergin, Jiayu Huang, Tzyy-Shuh Chang, Hao Yan

Abstract: One important characteristic of modern fault classification systems is the ability to flag the system when faced with previously unseen fault types. This work considers the unknown fault detection capabilities of deep neural network-based fault classifiers. Specifically, we propose a methodology on how, when available, labels regarding the fault taxonomy can be used to increase unknown fault detection performance without sacrificing model performance. To achieve this, we propose to utilize soft label techniques to improve the state-of-the-art deep novel fault detection techniques during the training process and novel hierarchically consistent detection statistics for online novel fault detection. Finally, we demonstrated increased detection performance on novel fault detection in inspection images from the hot steel rolling process, with results well replicated across multiple scenarios and baseline detection methods.

Submitted 26 March, 2024; originally announced March 2024.

Comments: Accepted in IISE Transaction

arXiv:2403.14874 [pdf, other]

WeatherProof: Leveraging Language Guidance for Semantic Segmentation in Adverse Weather

Authors: Blake Gella, Howard Zhang, Rishi Upadhyay, Tiffany Chang, Nathan Wei, Matthew Waliman, Yunhao Ba, Celso de Melo, Alex Wong, Achuta Kadambi

Abstract: We propose a method to infer semantic segmentation maps from images captured under adverse weather conditions. We begin by examining existing models on images degraded by weather conditions such as rain, fog, or snow, and find that they exhibit a large performance drop as compared to those captured under clear weather. To control for changes in scene structures, we propose WeatherProof, the first semantic segmentation dataset with accurate clear and adverse weather image pairs that share an underlying scene. Through this dataset, we analyze the error modes in existing models and find that they are sensitive to the highly complex combination of different weather effects induced on the image during capture. To improve robustness, we propose a way to use language as guidance by identifying contributions of adverse weather conditions and injecting that as "side information". Models trained using our language guidance exhibit performance gains by up to 10.2% in mIoU on WeatherProof, up to 8.44% in mIoU on the widely used ACDC dataset compared to standard training techniques, and up to 6.21% in mIoU on the ACDC dataset as compared to previous SOTA methods.

Submitted 7 May, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2312.09534

arXiv:2403.13754 [pdf, other]

Different Tokenization Schemes Lead to Comparable Performance in Spanish Number Agreement

Authors: Catherine Arnett, Pamela D. Rivière, Tyler A. Chang, Sean Trott

Abstract: The relationship between language model tokenization and performance is an open area of research. Here, we investigate how different tokenization schemes impact number agreement in Spanish plurals. We find that morphologically-aligned tokenization performs similarly to other tokenization schemes, even when induced artificially for words that would not be tokenized that way during training. We then present exploratory analyses demonstrating that language model embeddings for different plural tokenizations have similar distributions along the embedding space axis that maximally distinguishes singular and plural nouns. Our results suggest that morphologically-aligned tokenization is a viable tokenization approach, and existing models already generalize some morphological patterns to new items. However, our results indicate that morphological tokenization is not strictly required for performance.

Submitted 20 March, 2024; originally announced March 2024.

arXiv:2403.09188 [pdf]

Design of an basis-projected layer for sparse datasets in deep learning training using gc-ms spectra as a case study

Authors: Yu Tang Chang, Shih Fang Chen

Abstract: Deep learning (DL) models encompass millions or even billions of parameters and learn complex patterns from big data. However, not all data are initially stored in a suitable format to effectively train a DL model, e.g., gas chromatography-mass spectrometry (GC-MS) spectra and DNA sequences. These datasets commonly contain many zero values, and the sparse data format causes difficulties in optimizing DL models. A DL module called the basis-projected layer (BPL) was proposed to mitigate the issue by transforming the sparse data into a dense representation. The transformed data is expected to facilitate the gradient calculation and fine-tuning in a DL training process. The dataset, an example of a sparse dataset, contained 362 specialty coffee odorant spectra detected from GC-MS. The BPL layer was placed at the beginning of the DL model. The tunable parameters in the layer were learnable projected axes that were the bases of a new representation space. The layer rotated these bases when its parameters were updated. When the number of the bases was the same as the original dimension, the increasing percentage of the F1 scores was 8.56%. Furthermore, when the number was set as 768 (the original dimension was 490), the increasing percentage of the F1 score was 11.49%. The layer not only maintained the model performance but also constructed a better representation space in analyzing sparse datasets.

Submitted 14 March, 2024; originally announced March 2024.

Comments: 5 pages, 2 figures, 2 tables, conference

MSC Class: 68-06 ACM Class: I.2.4; J.2

arXiv:2403.08904 [pdf, other]

Detecting Hallucination and Coverage Errors in Retrieval Augmented Generation for Controversial Topics

Authors: Tyler A. Chang, Katrin Tomanek, Jessica Hoffmann, Nithum Thain, Erin van Liemt, Kathleen Meier-Hellstern, Lucas Dixon

Abstract: We explore a strategy to handle controversial topics in LLM-based chatbots based on Wikipedia's Neutral Point of View (NPOV) principle: acknowledge the absence of a single true answer and surface multiple perspectives. We frame this as retrieval augmented generation, where perspectives are retrieved from a knowledge base and the LLM is tasked with generating a fluent and faithful response from the given perspectives. As a starting point, we use a deterministic retrieval system and then focus on common LLM failure modes that arise during this approach to text generation, namely hallucination and coverage errors. We propose and evaluate three methods to detect such errors based on (1) word-overlap, (2) salience, and (3) LLM-based classifiers. Our results demonstrate that LLM-based classifiers, even when trained only on synthetic errors, achieve high error detection performance, with ROC AUC scores of 95.3% for hallucination and 90.5% for coverage error detection on unambiguous error cases. We show that when no training data is available, our other methods still yield good results on hallucination (84.0%) and coverage error (85.2%) detection.

Submitted 13 March, 2024; originally announced March 2024.

Comments: Accepted at LREC-COLING 2024

arXiv:2403.08553 [pdf, other]

Regret Analysis of Policy Optimization over Submanifolds for Linearly Constrained Online LQG

Authors: Ting-Jui Chang, Shahin Shahrampour

Abstract: Recent advancement in online optimization and control has provided novel tools to study online linear quadratic regulator (LQR) problems, where cost matrices are varying adversarially over time. However, the controller parameterization of existing works may not satisfy practical conditions like sparsity due to physical connections. In this work, we study online linear quadratic Gaussian problems with a given linear constraint imposed on the controller. Inspired by the recent work of [1] which proposed, for a linearly constrained policy optimization of an offline LQR, a second order method equipped with a Riemannian metric that emerges naturally in the context of optimal control problems, we propose online optimistic Newton on manifold (OONM) which provides an online controller based on the prediction on the first and second order information of the function sequence. To quantify the proposed algorithm, we leverage the notion of regret defined as the sub-optimality of its cumulative cost to that of a (locally) minimizing controller sequence and provide the regret bound in terms of the path-length of the minimizer sequence. Simulation results are also provided to verify the property of OONM.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2403.07205 [pdf, ps, other]

Asymptotic properties of the Stokes flow in an exterior domain with slowly decaying initial data and its application to the Navier-Stokes equations

Authors: Tongkeun Chang, Bum Ja Jin

Abstract: In this paper, we study the decay rate of the Stokes flow in an exterior domain with slowly decaying initial data ${\bf u}_0(x)=O(|x|^{-\alpha}), 0<\alpha\leq n$. As an application we find the unique strong solution of the Navier-Stokes equations corresponding to slowly decaying initial data. We also derive the pointwise decay estimate of the Navier-Stokes flow. Our decay rates will be optimal compared with the decay rates of the heat flow.

Submitted 15 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

arXiv:2403.00686 [pdf, other]

A Bit of a Problem: Measurement Disparities in Dataset Sizes Across Languages

Authors: Catherine Arnett, Tyler A. Chang, Benjamin K. Bergen

Abstract: How should text dataset sizes be compared across languages? Even for content-matched (parallel) corpora, UTF-8 encoded text can require a dramatically different number of bytes for different languages. In our work, we define the byte premium between two languages as the ratio of bytes used to encode content-matched text in those languages. We compute byte premiums for 1155 languages, and we use linear regressions to estimate byte premiums for other languages. We release a tool to obtain byte premiums for any two languages, enabling comparisons of dataset sizes across languages for more equitable multilingual model development and data practices.

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2403.00101 [pdf, other]

doi 10.1051/0004-6361/202348159

Fires in the deep: The luminosity distribution of early-time gamma-ray-burst afterglows in light of the Gamow Explorer sensitivity requirements

Authors: D. A. Kann, N. E. White, G. Ghirlanda, S. R. Oates, A. Melandri, M. Jelinek, A. de Ugarte Postigo, A. J. Levan, A. Martin-Carrillo, G. S. -H. Paek, L. Izzo, M. Blazek, C. Thone, J. F. Agui Fernandez, R. Salvaterra, N. R. Tanvir, T. -C. Chang, P. O'Brien, A. Rossi, D. A. Perley, M. Im, D. B. Malesani, A. Antonelli, S. Covino, C. Choi , et al. (36 additional authors not shown)

Abstract: Gamma-ray bursts (GRBs) are ideal probes of the Universe at high redshift (z > 5), pinpointing the locations of the earliest star-forming galaxies and providing bright backlights that can be used to spectrally fingerprint the intergalactic medium and host galaxy during the period of reionization. Future missions such as Gamow Explorer are being proposed to unlock this potential by increasing the rate of identification of high-z GRBs to rapidly trigger observations from 6-10 m ground telescopes, JWST, and the Extremely Large Telescopes. Gamow was proposed to the NASA 2021 Medium-Class Explorer (MIDEX) program as a fast-slewing satellite featuring a wide-field lobster-eye X-ray telescope (LEXT) to detect and localize GRBs, and a 30 cm narrow-field multi-channel photo-z infrared telescope (PIRT) to measure their photometric redshifts using the Lyman-alpha dropout technique. To derive the PIRT sensitivity requirement we compiled a complete sample of GRB optical-near-infrared afterglows from 2008 to 2021, adding a total of 66 new afterglows to our earlier sample, including all known high-z GRB afterglows. We performed full light-curve and spectral-energy-distribution analyses of these afterglows to derive their true luminosity at very early times. For all the light curves, where possible, we determined the brightness at the time of the initial finding chart of Gamow, at different high redshifts and in different NIR bands. We then followed the evolution of the luminosity to predict requirements for ground and space-based follow-up. We find that a PIRT sensitivity of 15 micro-Jy (21 mag AB) in a 500 s exposure simultaneously in five NIR bands within 1000 s of the GRB trigger will meet the Gamow mission requirement to recover > 80% of all redshifts at z > 5.

Submitted 29 February, 2024; originally announced March 2024.

Comments: 44 pages, 10 figures, 1 table. Accepted for publication in Astronomy and Astrophysics 15 Feb 2024. Abstract abridged for arXiv

Journal ref: A&A 686, A56 (2024)

arXiv:2402.18893 [pdf]

Direct Visualization of Disorder Driven Electronic Liquid Crystal Phases in Dirac Nodal Line Semimetal GdSbTe

Authors: Balaji Venkatesan, Syu-You Guan, Jen-Te Chang, Shiang-Bin Chiu, Po-Yuan Yang, Chih-Chuan Su, Tay-Rong Chang, Kalaivanan Raju, Raman Sankar, Somboon Fongchaiya, Ming-Wen Chu, Chia-Seng Chang, Guoqing Chang, Hsin Lin, Adrian Del Maestro, Ying-Jer Kao, Tien-Ming Chuang

Abstract: Electronic liquid crystal (ELC) phases are spontaneous symmetry breaking states believed to arise from strong electron correlation in quantum materials such as cuprates and iron pnictides. Here, we report a direct observation of ELC phases in a Dirac nodal line (DNL) semimetal GdSb$_x$Te$_{2-x}$. Electronic nanostructures consisting of incommensurate smectic charge modulation and intense local nematic order are visualized by using spectroscopic imaging - scanning tunneling microscopy. As topological materials with symmetry protected Dirac or Weyl fermions are mostly weakly correlated, the discovery of such ELC phases is anomalous and raises questions on the origin of their emergence. Specifically, we demonstrate how chemical substitution generates these symmetry breaking phases before the system undergoes a charge density wave - orthorhombic structural transition. We further show how dopants can induce nematicity via quasiparticle scattering interference. Our results highlight the importance of impurities in realizing ELC phases and present a new material platform for exploring the interplay among quenched disorder, topology and electron correlation.

Submitted 7 May, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

arXiv:2402.09970 [pdf, other]

Accelerating Parallel Sampling of Diffusion Models

Authors: Zhiwei Tang, Jiasheng Tang, Hao Luo, Fan Wang, Tsung-Hui Chang

Abstract: Diffusion models have emerged as state-of-the-art generative models for image generation. However, sampling from diffusion models is usually time-consuming due to the inherent autoregressive nature of their sampling process. In this work, we propose a novel approach that accelerates the sampling of diffusion models by parallelizing the autoregressive process. Specifically, we reformulate the sampling process as solving a system of triangular nonlinear equations through fixed-point iteration. With this innovative formulation, we explore several systematic techniques to further reduce the iteration steps required by the solving process. Applying these techniques, we introduce ParaTAA, a universal and training-free parallel sampling algorithm that can leverage extra computational and memory resources to increase the sampling speed. Our experiments demonstrate that ParaTAA can decrease the inference steps required by common sequential sampling algorithms such as DDIM and DDPM by a factor of 4$\sim$14. Notably, when applying ParaTAA with 100-step DDIM for Stable Diffusion, a widely-used text-to-image diffusion model, it can produce the same images as the sequential sampling in only 7 inference steps. The code is available at https://github.com/TZW1998/ParaTAA-Diffusion.

Submitted 27 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

Comments: ICML 2024

arXiv:2402.09941 [pdf, other]

FedLion: Faster Adaptive Federated Optimization with Fewer Communication

Authors: Zhiwei Tang, Tsung-Hui Chang

Abstract: In Federated Learning (FL), a framework to train machine learning models across distributed data, well-known algorithms like FedAvg tend to have slow convergence rates, resulting in high communication costs during training. To address this challenge, we introduce FedLion, an adaptive federated optimization algorithm that seamlessly incorporates key elements from the recently proposed centralized adaptive algorithm, Lion (Chen et al. 2023), into the FL framework. Through comprehensive evaluations on two widely adopted FL benchmarks, we demonstrate that FedLion outperforms previous state-of-the-art adaptive algorithms, including FAFED (Wu et al. 2023) and FedDA. Moreover, thanks to the use of signed gradients in local training, FedLion substantially reduces data transmission requirements during uplink communication when compared to existing adaptive algorithms, further reducing communication costs. Last but not least, this work also includes a novel theoretical analysis, showcasing that FedLion attains a faster convergence rate than established FL algorithms like FedAvg.

Submitted 15 February, 2024; originally announced February 2024.

Comments: ICASSP 2024

arXiv:2402.03829 [pdf, ps, other]

Precise Measurement of Born Cross Sections for $e^+e^-\to D\bar{D}$ and Observation of One Structure between $\sqrt{s} = 3.80-4.95$ GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (604 additional authors not shown)

Abstract: Using data samples collected with the BESIII detector at the BEPCII collider at center-of-mass energies ranging from 3.80 to 4.95 GeV, corresponding to an integrated luminosity of 20 fb$^{-1}$, a measurement of Born cross sections for the $e^+e^-\to D^{0}\bar{D}^{0}$ and $D^{+}D^{-}$ processes is presented with unprecedented precision. By performing a simultaneous fit to the dressed cross sections for both processes, one possible new structure around 3.9 GeV/$c^2$ is observed for the first time, in addition to seven known resonances $ψ(3770)$, $ψ(4040)$, $ψ(4160)$, $Y(4230)$, $Y(4360)$, $ψ(4415)$, and $Y(4660)$. These results offer crucial experimental insights into the nature of hadron production in the open charm region.

Submitted 6 February, 2024; originally announced February 2024.

Comments: 9 pages, 4 figures, 1 table, 1 Supplemental_Material

arXiv:2402.02323 [pdf]

Infrared Optical Anisotropy in Quasi-1D Hexagonal Chalcogenide BaTiSe3

Authors: Boyang Zhao, Hongyan Mei, Zhengyu Du, Shantanu Singh, Tieyan Chang, Jiaheng Li, Nicholas S. Settineri, Simon J. Teat, Yu-Sheng Chen, Stephen B. Cronin, Mikhail A. Kats, Jayakanth Ravichandran

Abstract: Polarimetric infrared detection bolsters IR thermography by leveraging the polarization of light. Optical anisotropy, i.e., birefringence and dichroism, can be leveraged to achieve polarimetric detection. Recently, giant optical anisotropy was discovered in quasi-1D narrow-bandgap hexagonal perovskite sulfides, A1+xTiS3, specifically BaTiS3[1,2] and Sr9/8TiS3[3,4]. In these materials, the critical role of atomic-scale structure modulations[4,5] in the unconventional electrical[5,6], optical[7,8], and thermal[7,9] properties raises the broader question of other materials that belong to this family. To address this issue, for the first time, we synthesized high-quality single crystals of a largely unexplored member of the A1+xTiX3 (X = S, Se) family, BaTiSe3. Single-crystal X-ray diffraction determined the room-temperature structure with the P31c space group, which is a superstructure of the earlier reported[10] P63/mmc structure. The crystal structure of BaTiSe3 features antiparallel c-axis displacements similar to BaTiS3,[2] but is of lower symmetry. Polarization-resolved Raman and Fourier transform infrared (FTIR) spectroscopy were used to characterize the optical anisotropy of BaTiSe3, whose refractive index along the ordinary (perpendicular to c) and extraordinary (parallel to c) optical axes was quantitatively determined by combining ellipsometry studies with FTIR. With a giant birefringence Δn~0.9, BaTiSe3 emerges as a new candidate for miniaturized birefringent optics for mid-wave infrared to long-wave infrared imaging.

Submitted 3 February, 2024; originally announced February 2024.

arXiv:2401.15484 [pdf, other]

R$\times$R: Rapid eXploration for Reinforcement Learning via Sampling-based Reset Distributions and Imitation Pre-training

Authors: Gagan Khandate, Tristan L. Saidi, Siqi Shang, Eric T. Chang, Yang Liu, Seth Dennis, Johnson Adams, Matei Ciocarlie

Abstract: We present a method for enabling Reinforcement Learning of motor control policies for complex skills such as dexterous manipulation. We posit that a key difficulty in training such policies is exploring the problem state space, as the accessible and useful regions of this space form a complex structure along manifolds of the original high-dimensional state space. This work presents a method to enable and support exploration with Sampling-based Planning. We use a generally applicable non-holonomic Rapidly-exploring Random Trees algorithm and present multiple methods to use the resulting structure to bootstrap model-free Reinforcement Learning. Our method is effective at learning various challenging dexterous motor control skills of higher difficulty than previously shown. In particular, we achieve dexterous in-hand manipulation of complex objects while simultaneously securing the object without the use of passive support surfaces. These policies also transfer effectively to real robots. A number of example videos can also be found on the project website: https://sbrl.cs.columbia.edu

Submitted 27 January, 2024; originally announced January 2024.

Comments: 20 pages, 14 figures, submitted to Autonomous Robots, RSS 2023 Special Issue. arXiv admin note: substantial text overlap with arXiv:2303.03486

arXiv:2401.14720 [pdf, ps, other]

Observation of structures in the processes $e^+e^-\rightarrowωχ_{c1}$ and $ωχ_{c2}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (608 additional authors not shown)

Abstract: We present measurements of the Born cross sections for the processes $e^+e^-\rightarrowωχ_{c1}$ and $ωχ_{c2}$ at center-of-mass energies $\sqrt{s}$ from 4.308 to 4.951 GeV. The measurements are performed with data samples corresponding to an integrated luminosity of 11.0 $\rm{fb}^{-1}$ collected with the BESIII detector operating at the BEPCII storage ring. Assuming the $e^+e^-\rightarrowωχ_{c2}$ signals come from a single resonance, the mass and width are determined to be $M=(4413.6\pm9.0\pm0.8)$ MeV/$c^2$ and $Γ=(110.5\pm15.0\pm2.9)$ MeV, respectively, which is consistent with the parameters of the well-established resonance $ψ(4415)$. In addition, we also use one single resonance to describe the $e^+e^-\rightarrowωχ_{c1}$ lineshape, and determine the mass and width to be $M=(4544.2\pm18.7\pm1.7)$ MeV/$c^2$ and $Γ=(116.1\pm33.5\pm1.7)$ MeV, respectively. The structure of this lineshape, observed for the first time, requires further understanding.

Submitted 24 March, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

Comments: 11 pages, 8 figures, with Supplemental Material

arXiv:2401.14711 [pdf, other]

Study of $e^{+}e^{-}\rightarrowπ^{+}π^{-}π^{0}$ at $\sqrt{s}$ from 2.00 to 3.08 GeV at BESIII

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (608 additional authors not shown)

Abstract: With the data samples taken at center-of-mass energies from 2.00 to 3.08 GeV with the BESIII detector at the BEPCII collider, a partial wave analysis on the $e^{+}e^{-}\rightarrowπ^{+}π^{-}π^{0}$ process is performed. The Born cross sections for $e^{+}e^{-}\rightarrowπ^{+}π^{-}π^{0}$ and its intermediate processes $e^{+}e^{-}\rightarrowρπ$ and $ρ(1450)π$ are measured as functions of $\sqrt{s}$. The results for $e^{+}e^{-}\rightarrowπ^{+}π^{-}π^{0}$ are consistent with previous results measured with the initial state radiation method within one standard deviation, and improve the uncertainty by a factor of ten. By fitting the line shapes of the Born cross sections for the $e^{+}e^{-}\rightarrowρπ$ and $ρ(1450)π$, a structure with mass $M = 2119\pm11\pm15\ {\rm MeV}/c^2$ and width $Γ=69\pm30\pm5 {\rm MeV}$ is observed with a significance of $5.9σ$, where the first uncertainties are statistical and the second ones are systematic. This structure can be interpreted as an excited $ω$ state.

Submitted 26 January, 2024; originally announced January 2024.

arXiv:2401.12025 [pdf, other]

A Survey of Recent Advances in Optimization Methods for Wireless Communications

Authors: Ya-Feng Liu, Tsung-Hui Chang, Mingyi Hong, Zheyu Wu, Anthony Man-Cho So, Eduard A. Jorswieck, Wei Yu

Abstract: Mathematical optimization is now widely regarded as an indispensable modeling and solution tool for the design of wireless communications systems. While optimization has played a significant role in the revolutionary progress in wireless communication and networking technologies from 1G to 5G and onto the future 6G, the innovations in wireless technologies have also substantially transformed the nature of the underlying mathematical optimization problems upon which the system designs are based and have sparked significant innovations in the development of methodologies to understand, to analyze, and to solve those problems. In this paper, we provide a comprehensive survey of recent advances in mathematical optimization theory and algorithms for wireless communication system design. We begin by illustrating common features of mathematical optimization problems arising in wireless communication system design. We discuss various scenarios and use cases and their associated mathematical structures from an optimization perspective. We then provide an overview of recently developed optimization techniques in areas ranging from nonconvex optimization, global optimization, and integer programming, to distributed optimization and learning-based optimization. The key to successful solution of mathematical optimization problems is in carefully choosing or developing suitable algorithms (or neural network architectures) that can exploit the underlying problem structure. We conclude the paper by identifying several open research challenges and outlining future research directions.

Submitted 7 June, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

Comments: 39 pages, 5 figures, accepted for publication in IEEE Journal on Selected Areas in Communications

arXiv:2401.06164 [pdf, other]

Multimodal Gen-AI for Fundamental Investment Research

Authors: Lezhi Li, Ting-Yu Chang, Hai Wang

Abstract: This report outlines a transformative initiative in the financial investment industry, where the conventional decision-making process, laden with labor-intensive tasks such as sifting through voluminous documents, is being reimagined. Leveraging language models, our experiments aim to automate information summarization and investment idea generation. We seek to evaluate the effectiveness of fine-tuning methods on a base model (Llama2) to achieve specific application-level goals, including providing insights into the impact of events on companies and sectors, understanding market condition relationships, generating investor-aligned investment ideas, and formatting results with stock recommendations and detailed explanations. Through state-of-the-art generative modeling techniques, the ultimate objective is to develop an AI agent prototype, liberating human investors from repetitive tasks and allowing a focus on high-level strategic thinking. The project encompasses a diverse corpus dataset, including research reports, investment memos, market news, and extensive time-series market data. We conducted three experiments applying unsupervised and supervised LoRA fine-tuning on the llama2_7b_hf_chat as the base model, as well as instruction fine-tuning on the GPT3.5 model. Statistical and human evaluations both show that the fine-tuned versions perform better in solving text modeling, summarization, reasoning, and finance domain questions, demonstrating a pivotal step towards enhancing decision-making processes in the financial domain. Code implementation for the project can be found on GitHub: https://github.com/Firenze11/finance_lm.

Submitted 23 December, 2023; originally announced January 2024.

arXiv:2401.05691 [pdf, ps, other]

doi 10.1039/D4MH00165F

Atomic Scale Quantum Anomalous Hall Effect in Monolayer Graphene/$\rm MnBi_{2}Te_{4}$ Heterostructure

Authors: Yueh-Ting Yao, Suyang Xu, Tay-Rong Chang

Abstract: The two-dimensional quantum anomalous Hall (QAH) effect is direct evidence of non-trivial Berry curvature topology in condensed matter physics. Searching for QAH in 2D materials, particularly with simplified fabrication methods, poses a significant challenge in future applications. Despite numerous theoretical works proposing the QAH effect with $C=2$ in graphene, experimental evidence remains absent because magnetism sources such as proper substrate effects have been neglected. In this work, we propose the QAH effect in graphene/$\rm MnBi_{2}Te_{4}$ (MBT) heterostructure based on density-functional theory (DFT). The monolayer MBT introduces spin-orbital coupling, Zeeman exchange field, and Kekul$\rm \acute{e}$ distortion as a substrate effect into graphene, resulting in QAH with $C=1$ in the heterostructure. Our effective Hamiltonian further presents a rich phase diagram that has not been studied previously. Our work provides a new and practical way to explore the QAH effect in monolayer graphene and the magnetic topological phases by the flexibility of MBT family materials.

Submitted 11 January, 2024; originally announced January 2024.

arXiv:2401.04671 [pdf, other]

doi 10.1103/PhysRevB.109.144101

Jahn-Teller driven quadrupolar ordering and spin-orbital dimer formation in GaNb$_{4}$Se$_{8}$

Authors: Tsung-Han Yang, Tieyan Chang, Yu-Sheng Chen, K. W. Plumb

Abstract: The lacunar spinel GaNb$_4$Se$_8$ is a tetrahedral cluster Mott insulator where spin-orbit coupling on molecular orbitals and Jahn-Teller energy scales are competitive. GaNb$_4$Se$_8$ undergoes a structural and anti-polar ordering transition at T$_Q$ = 50 K that corresponds to a quadrupolar ordering of molecular orbitals on Nb$_4$ clusters. A second transition occurs at T$_M$ = 29 K, where local distortions on the Nb$_4$ clusters rearrange. We present a single crystal x-ray diffraction investigation of these phase transitions and solve the crystal structure in the intermediate T$_M$ < T < T$_Q$ and low T < T$_M$ temperature phases. The intermediate phase is a primitive cubic P2$_1$3 structure with a staggered arrangement of Nb$_4$ cluster distortions. A symmetry mode analysis reveals that the transition at T$_Q$ is continuous and described by a single Jahn-Teller active amplitude mode. In the low temperature phase, the symmetry of Nb$_4$ clusters is further reduced and the unit cell doubles into an orthorhombic P2$_1$2$_1$2$_1$ space group. Nb$_4$ clusters rearrange through this transition to form a staggered arrangement of intercluster dimers, suggesting a valence bond solid magnetic state.

Submitted 9 January, 2024; originally announced January 2024.

Journal ref: Phys. Rev. B 109, 144101 (2024)

arXiv:2401.04204 [pdf, other]

Effects of Bursty Star Formation on [C II] Line Intensity Mapping of High-redshift Galaxies

Authors: Lun-Jun Liu, Guochao Sun, Tzu-Ching Chang, Steven R. Furlanetto, Charles M. Bradford

Abstract: Bursty star formation -- a key prediction for high-redshift galaxies from cosmological simulations explicitly resolving stellar feedback in the interstellar medium -- has recently been observed to prevail among galaxies at redshift $z \gtrsim 6$. Line intensity mapping (LIM) of the 158 $μ$m [C II] line as a star formation rate indicator offers unique opportunities to tomographically constrain cosmic star formation at high redshift, as an alternative to observations of individually detected galaxies. To understand effects of bursty star formation on [C II] LIM, which remain unexplored in previous studies, we present an analytic modeling framework for high-$z$ galaxy formation and [C II] LIM signals that accounts for bursty star formation histories induced by delayed supernova feedback. We use it to explore and characterize how bursty star formation can impact and thus complicate the interpretation of the [C II] luminosity function and power spectrum. Our simple analytic model indicates that bursty star formation is most important for low halo masses, and in the power spectrum it can create a substantial excess in the large-scale clustering term. This distortion results in a power spectrum shape which cannot be explained by invoking a mass-independent scatter. We conclude that burstiness must be accounted for when modeling and analyzing [C II] datasets from the early universe, and that in the extreme, the signature of burstiness may be detectable with first-generation experiments such as TIME, CONCERTO, and CCAT-DSS.

Submitted 8 January, 2024; originally announced January 2024.

Comments: 12 pages, 4 figures, submitted to AAS journals; comments welcome

arXiv:2401.00878 [pdf, ps, other]

Observation of $\mathcal R(3810)$ in $e^+e^-\rightarrow {\rm hadrons}$ and Improved Measurements of the Resonance Parameters of $\mathcal R(3760)$ and $\mathcal R(3780)$

Authors: M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (596 additional authors not shown)

Abstract: We report the measurement of the cross sections for $e^+e^-\rightarrow {\rm hadrons}$ at center-of-mass (c.m.) energies from 3.645 to 3.871 GeV. We observe a new resonance $\mathcal R(3810)$ in the cross sections for the first time, and observe the $\mathcal R(3760)$ resonance with high significance in the cross sections. The $\mathcal R(3810)$ has a mass of $(3804.5 \pm 0.9 \pm 0.9)$ ~MeV/$c^2$, a total width of $(5.4 \pm 3.5 \pm 3.2)$~MeV, and an electronic partial width of $(19.4 \pm 7.4 \pm 12.1)$~eV. Its significance is $7.7σ$. The $\mathcal R(3810)$ could be interpreted as a hadro-charmonium resonance predicted by Quantum Chromodynamics (QCD). In addition, we measure the mass $(3751.9\pm 3.8\pm 2.8)$ ~MeV/$c^2$, the total width $(32.8 \pm 5.8 \pm 8.7)$~MeV, and the electronic partial width $(184\pm 75\pm 86)$~eV with improved precision for the $\mathcal R(3760)$. Furthermore, for the $\mathcal R(3780)$ we measure the mass $(3778.7\pm 0.5\pm 0.3)$ ~MeV/$c^2$ and total width $(20.3 \pm 0.8 \pm 1.7)$~MeV with improved precision, and the electronic partial width $(265\pm 69\pm 83)$~eV. The $\mathcal R(3780)$ can be interpreted as the $1^3D_1$ state of charmonium. Its mass and total width differ significantly from the corresponding fitted values given by the Particle Data Group in 2022 by 7.1 and 3.2 times the uncertainties for $ψ(3770)$, respectively. $ψ(3770)$ has been interpreted as the $1^3D_1$ state for 45 years.

Submitted 30 December, 2023; originally announced January 2024.

arXiv:2312.14318 [pdf, other]

Trapped atoms and superradiance on an integrated nanophotonic microring circuit

Authors: Xinchao Zhou, Hikaru Tamura, Tzu-Han Chang, Chen-Lung Hung

Abstract: Interfacing cold atoms with integrated nanophotonic devices could offer new paradigms for engineering atom-light interactions and provide a potentially scalable route for quantum sensing, metrology, and quantum information processing. However, it remains a challenging task to efficiently trap a large ensemble of cold atoms on an integrated nanophotonic circuit. Here, we demonstrate direct loading of an ensemble of up to 70 atoms into an optical microtrap on a nanophotonic microring circuit. Efficient trap loading is achieved by employing degenerate Raman-sideband cooling in the microtrap, where a built-in spin-motion coupling arises directly from the vector light shift of the evanescent field potential on a microring. Atoms are cooled into the trap via optical pumping with a single free space beam. We have achieved a trap lifetime approaching 700 ms under continuous cooling. We show that the trapped atoms display large cooperative coupling and superradiant decay into a whispering-gallery mode of the microring resonator, holding promise for explorations of new collective effects. Our technique can be extended to trapping a large ensemble of cold atoms on nanophotonic circuits for various quantum applications.

Submitted 21 June, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

Comments: 11 pages, 9 figures

arXiv:2312.13593 [pdf, ps, other]

Search for the decay $χ_{c1}(3872)\toπ^{+}π^{-}χ_{c1}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (608 additional authors not shown)

Abstract: Using a data sample corresponding to an integrated luminosity of 10.9 fb$^{-1}$ collected at center-of-mass energies from 4.16 to 4.34 GeV with the BESIII detector, we search for the decay $χ_{c1}(3872) \to π^{+}π^{-}χ_{c1}$ in the radiative production $e^{+}e^{-} \to γχ_{c1}(3872)$. No significant signal is observed, and the ratio for the branching fraction of $χ_{c1}(3872) \to π^{+}π^{-}χ_{c1}$ to $χ_{c1}(3872) \to π^{+}π^{-}J/ψ$ is measured as $\mathcal{R}\equiv\frac{\mathcal{B}[χ_{c1}(3872) \to π^{+}π^{-}χ_{c1}]}{\mathcal{B}[χ_{c1}(3872) \to π^{+}π^{-} J/ψ]}<0.18$ at 90$\%$ confidence level. The upper limit on the product of the cross section $σ[e^{+}e^{-}\toγχ_{c1}(3872)]$ and the branching fraction $\mathcal{B}[χ_{c1}(3872)\toπ^{+}π^{-}χ_{c1}]$ at each center-of-mass energy is also given. These measurements favor the non-conventional charmonium nature of the $χ_{c1}(3872)$ state.

Submitted 21 December, 2023; originally announced December 2023.

Comments: 8 pages, 1 figure

arXiv:2312.11794 [pdf, ps, other]

doi 10.1103/PhysRevB.109.155143

Feature-energy duality of topological boundary states in multilayer quantum spin Hall insulator

Authors: Yueh-Ting Yao, Xiaoting Zhou, Yi-Chun Hung, Hsin Lin, Arun Bansil, Tay-Rong Chang

Abstract: Gapless topological boundary states characterize nontrivial topological phases arising from the bulk-boundary correspondence in symmetry-protected topological materials, such as the emergence of helical edge states in a two-dimensional $\mathbb{Z}_2$ topological insulator. However, the incorporation of symmetry-breaking perturbation terms in the Hamiltonian leads to the gapping of these edge bands, resulting in the loss of these crucial topological boundary states. In this work, we systematically investigate the robustness of bulk-boundary correspondence in the quantum spin Hall insulator via recently introduced feature spectrum topology. Our findings present a comprehensive understanding of feature-energy duality, illustrating that the aggregate number of gapless edge states in the energy-momentum ($\it{E-k}$) map and the non-trivial edge states in the $\hat{S}_z$ feature spectrum equals the spin Chern number of multilayer quantum spin Hall insulator. We identify a van der Waals material bismuth bromide $\rm(Bi_4Br_4)$ as a promising candidate through first-principles calculations. Our work not only unravels the intricacies of bulk-boundary correspondence but also charts a course for exploring quantum spin Hall insulators with high spin-Chern number.

Submitted 18 December, 2023; originally announced December 2023.

arXiv:2312.10962 [pdf, other]

Observation of significant flavor-SU(3) breaking in the kaon wave function at $12~{\rm GeV}^2<Q^2<25~{\rm GeV}^2$ and discovery of the charmless decay $ψ(3770)\to K_S^0K_L^0$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (607 additional authors not shown)

Abstract: We present cross sections for the reaction $e^+e^-\to K_S^0K_L^0$ at center-of-mass energies ranging from 3.51 GeV to 4.95 GeV using data samples collected in the BESIII experiment, corresponding to a total integrated luminosity of 26.5 fb$^{-1}$. The ratio of neutral-to-charged kaon form factors at large momentum transfers ($12~{\rm GeV}^2<Q^2<25~{\rm GeV}^2$) is determined to be $0.21\pm 0.01$, which indicates a small but significant effect of flavor-SU(3) breaking in the kaon wave function, and consequently excludes the possibility that flavor-SU(3) breaking is the primary reason for the strong experimental violation of the pQCD prediction $|F(π^{\pm})|/|F(K^{\pm})|=f^2_π/f^2_{K}$, where $F(π^{\pm})$ and $F(K^{\pm})$ are the form factors, and $f_π$ and $f_{K}$ are the decay constants of charged pions and kaons, respectively. We also observe a significant signal for the charmless decay $ψ(3770)\to K_S^0K_L^0$ for the first time. Within a $1σ$ contour of the likelihood value, the branching fraction for $ψ(3770)\to K_S^0K_L^0$ is determined to be ${\cal B}=(2.63_{-1.59}^{+1.40})\times 10^{-5}$, and the relative phase between the continuum and $ψ(3770)$ amplitudes is $φ=(-0.39_{-0.10}^{+0.05})π$. The branching fraction is in good agreement with the $\mathcal{S}$- and $\mathcal{D}$-wave charmonia mixing scheme proposed in the interpretation of the "$ρπ$ puzzle" between $J/ψ$ and $ψ(3686)$ decays.

Submitted 18 December, 2023; originally announced December 2023.

Comments: 18 pages, 56 figures

arXiv:2312.10483 [pdf, other]

doi 10.1109/BioCAS54905.2022.9948588

All Attention U-NET for Semantic Segmentation of Intracranial Hemorrhages In Head CT Images

Authors: Chia Shuo Chang, Tian Sheuan Chang, Jiun Lin Yan, Li Ko

Abstract: Head CT scans serve as a first-line tool to help specialists diagnose different types of intracranial hemorrhage. However, hemorrhages of the same type have diverse shapes, while hemorrhages of different types can be confusingly similar in shape, size, and location. To solve this problem, this paper proposes an all attention U-Net. It uses channel attention on the U-Net encoder side to enhance class-specific feature extraction, and spatial and channel attention on the U-Net decoder side for more accurate shape extraction and type classification. The simulation results show up to a 31.8% improvement compared to the baseline, ResNet50 + U-Net, and better performance than variants with limited attention.

Submitted 16 December, 2023; originally announced December 2023.

Comments: 2022 IEEE Biomedical Circuits and Systems Conference (BioCAS)

Showing 1–50 of 673 results for author: Chang, T