An Earth-sized Planet on the Verge of Tidal Disruption

Authors: Fei Dai, Andrew W. Howard, Samuel Halverson, Jaume Orell-Miquel, Enric Palle, Howard Isaacson, Benjamin Fulton, Ellen M. Price, Mykhaylo Plotnykov, Leslie A. Rogers, Diana Valencia, Kimberly Paragas, Michael Greklek-McKeon, Jonathan Gomez Barrientos, Heather A. Knutson, Erik A. Petigura, Lauren M. Weiss, Rena Lee, Casey L. Brinkman, Daniel Huber, Gudmundur Steffansson, Kento Masuda, Steven Giacalone, Cicero X. Lu, Edwin S. Kite, et al. (73 additional authors not shown)

Abstract: TOI-6255 b (GJ 4256) is an Earth-sized planet (1.079$\pm0.065$ $R_\oplus$) with an orbital period of only 5.7 hours. With the newly commissioned Keck Planet Finder (KPF) and CARMENES spectrographs, we determined the planet's mass to be 1.44$\pm$0.14 $M_{\oplus}$. The planet is just outside the Roche limit, with $P_{\rm orb}/P_{\rm Roche}$ = 1.13 $\pm0.10$. The strong tidal force likely deforms the planet into a triaxial ellipsoid with a long axis that is $\sim$10% longer than the short axis. Assuming a reduced stellar tidal quality factor $Q_\star^\prime \approx10^7$, we predict that tidal orbital decay will cause TOI-6255 b to reach the Roche limit in roughly 400 Myr. Such tidal disruptions may produce the possible signatures of planet engulfment that have been observed on stars with anomalously high refractory elemental abundances compared to their conatal binary companions. TOI-6255 b is also a favorable target for searching for star-planet magnetic interactions, which might cause interior melting and hasten orbital decay. TOI-6255 b is a top target (Emission Spectroscopy Metric of about 24) for phase curve observations with the James Webb Space Telescope.

Submitted 30 July, 2024; originally announced July 2024.

Comments: 18 pages, 7 figures, 5 tables, accepted to AAS Journals. The first RV mass measurement from the Keck Planet Finder
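The period ratio quoted in the abstract above can be roughly reproduced from the stated mass and radius. The sketch below assumes a commonly used density scaling for the Roche-limit period, $P_{\rm Roche} \approx 12.6\,{\rm h}\,(\rho_p / 1\,{\rm g\,cm^{-3}})^{-1/2}$; neither this scaling nor the Earth mean density value appears in the abstract, and the authors' own calculation may differ.

```python
# Minimal sketch: estimate P_orb / P_Roche from the abstract's mass and radius.
# ASSUMPTIONS (not from the abstract): the 12.6-hour density scaling for the
# Roche-limit period and an Earth mean density of 5.514 g/cm^3.

EARTH_MEAN_DENSITY = 5.514   # g/cm^3, assumed reference value
ROCHE_COEFF_HOURS = 12.6     # hours, assumed scaling coefficient


def bulk_density(mass_earth: float, radius_earth: float) -> float:
    """Bulk density in g/cm^3 for a planet given in Earth masses and radii."""
    return EARTH_MEAN_DENSITY * mass_earth / radius_earth ** 3


def roche_period_hours(density: float) -> float:
    """Orbital period at the Roche limit under the assumed density scaling."""
    return ROCHE_COEFF_HOURS / density ** 0.5


rho = bulk_density(mass_earth=1.44, radius_earth=1.079)  # values from the abstract
p_roche = roche_period_hours(rho)
ratio = 5.7 / p_roche  # 5.7 h orbital period from the abstract

print(f"density = {rho:.2f} g/cm^3")
print(f"P_Roche = {p_roche:.2f} h")
print(f"P_orb / P_Roche = {ratio:.2f}")
```

Under these assumptions the ratio lands close to the quoted 1.13, well inside the stated uncertainty of 0.10.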

arXiv:2407.17688 [pdf, other]

Examining the Influence of Political Bias on Large Language Model Performance in Stance Classification

Authors: Lynnette Hui Xian Ng, Iain Cruickshank, Roy Ka-Wei Lee

Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities in executing tasks based on natural language queries. However, these models, trained on curated datasets, inherently embody biases ranging from racial to national and gender biases. It remains uncertain whether these biases impact the performance of LLMs for certain tasks. In this study, we investigate the political biases of LLMs within the stance classification task, specifically examining whether these models exhibit a tendency to more accurately classify politically-charged stances. Utilizing three datasets, seven LLMs, and four distinct prompting schemes, we analyze the performance of LLMs on politically oriented statements and targets. Our findings reveal a statistically significant difference in the performance of LLMs across various politically oriented stance classification tasks. Furthermore, we observe that this difference primarily manifests at the dataset level, with models and prompting schemes showing statistically similar performances across different stance classification datasets. Lastly, we observe that when there is greater ambiguity in the target the statement is directed towards, LLMs have poorer stance classification accuracy. Code & Dataset: http://doi.org/10.5281/zenodo.12938478

Submitted 26 July, 2024; v1 submitted 24 July, 2024; originally announced July 2024.

Comments: Accepted at ICWSM 2025

arXiv:2407.13942 [pdf, other]

Harmful Suicide Content Detection

Authors: Kyumin Park, Myung Jae Baik, YeongJun Hwang, Yen Shin, HoJae Lee, Ruda Lee, Sang Min Lee, Je Young Hannah Sun, Ah Rah Lee, Si Yeun Yoon, Dong-ho Lee, Jihyung Moon, JinYeong Bak, Kyunghyun Cho, Jong-Woo Paik, Sungjoon Park

Abstract: Harmful suicide content on the Internet is a significant risk factor inducing suicidal thoughts and behaviors among vulnerable populations. Despite global efforts, existing resources are insufficient, specifically in high-risk regions like the Republic of Korea. Current research mainly focuses on understanding negative effects of such content or suicide risk in individuals, rather than on automatically detecting the harmfulness of content. To fill this gap, we introduce a harmful suicide content detection task for classifying online suicide content into five harmfulness levels. We develop a multi-modal benchmark and a task description document in collaboration with medical professionals, and leverage large language models (LLMs) to explore efficient methods for moderating such content. Our contributions include proposing a novel detection task, a multi-modal Korean benchmark with expert annotations, and suggesting strategies using LLMs to detect illegal and harmful content. Owing to the potential harm involved, we publicize our implementations and benchmark, incorporating an ethical verification process.

Submitted 2 June, 2024; originally announced July 2024.

Comments: 30 pages, 7 figures

arXiv:2407.12882 [pdf, other]

InstructAV: Instruction Fine-tuning Large Language Models for Authorship Verification

Authors: Yujia Hu, Zhiqiang Hu, Chun-Wei Seah, Roy Ka-Wei Lee

Abstract: Large Language Models (LLMs) have demonstrated remarkable proficiency in a wide range of NLP tasks. However, when it comes to authorship verification (AV) tasks, which involve determining whether two given texts share the same authorship, even advanced models like ChatGPT exhibit notable limitations. This paper introduces a novel approach, termed InstructAV, for authorship verification. This approach utilizes LLMs in conjunction with a parameter-efficient fine-tuning (PEFT) method to simultaneously improve accuracy and explainability. The distinctiveness of InstructAV lies in its ability to align classification decisions with transparent and understandable explanations, representing a significant progression in the field of authorship verification. Through comprehensive experiments conducted across various datasets, InstructAV demonstrates its state-of-the-art performance on the AV task, offering high classification accuracy coupled with enhanced explanation reliability.

Submitted 16 July, 2024; originally announced July 2024.

arXiv:2407.12867 [pdf, other]

Swift-BAT GUANO follow-up of gravitational-wave triggers in the third LIGO-Virgo-KAGRA observing run

Authors: Gayathri Raman, Samuele Ronchini, James Delaunay, Aaron Tohuvavohu, Jamie A. Kennea, Tyler Parsotan, Elena Ambrosi, Maria Grazia Bernardini, Sergio Campana, Giancarlo Cusumano, Antonino D'Ai, Paolo D'Avanzo, Valerio D'Elia, Massimiliano De Pasquale, Simone Dichiara, Phil Evans, Dieter Hartmann, Paul Kuin, Andrea Melandri, Paul O'Brien, Julian P. Osborne, Kim Page, David M. Palmer, Boris Sbarufatti, Gianpiero Tagliaferri, et al. (1797 additional authors not shown)

Abstract: We present results from a search for X-ray/gamma-ray counterparts of gravitational-wave (GW) candidates from the third observing run (O3) of the LIGO-Virgo-KAGRA (LVK) network using the Swift Burst Alert Telescope (Swift-BAT). The search includes 636 GW candidates received in low latency, 86 of which have been confirmed by the offline analysis and included in the third cumulative Gravitational-Wave Transient Catalog (GWTC-3). Targeted searches were carried out on the entire GW sample using the maximum-likelihood NITRATES pipeline on the BAT data made available via the GUANO infrastructure. We do not detect any significant electromagnetic emission that is temporally and spatially coincident with any of the GW candidates. We report flux upper limits in the 15-350 keV band as a function of sky position for all the catalog candidates. For GW candidates where the Swift-BAT false alarm rate is less than 10$^{-3}$ Hz, we compute the GW-BAT joint false alarm rate. Finally, the derived Swift-BAT upper limits are used to infer constraints on the putative electromagnetic emission associated with binary black hole mergers.

Submitted 13 July, 2024; originally announced July 2024.

Comments: 50 pages, 10 figures, 4 tables

arXiv:2407.12503 [pdf, other]

Polylogarithmic functions with prescribed branching locus and linear relations between them

Authors: Roman N. Lee

Abstract: We consider the problem of finding the set of classical polylogarithmic functions $\text{Li}_n$ with branching locus determined by the solution of $p_1\cdot p_2\cdot \ldots \cdot p_n=0$, where $p_1,\ldots, p_n$ are irreducible polynomials of several variables. We present an algorithm for constructing a complete set of possible arguments of $\text{Li}_n$ functions. The corresponding Mathematica code is included as an ancillary file. Using this algorithm and the symbol map, we provide some examples of polylogarithmic identities.

Submitted 17 July, 2024; originally announced July 2024.

Comments: 7 pages

arXiv:2407.09105 [pdf, other]

Enhancing Training Efficiency Using Packing with Flash Attention

Authors: Achintya Kundu, Rhui Dih Lee, Laura Wynter, Raghu Kiran Ganti, Mayank Mishra

Abstract: Padding is often used in tuning LLMs by adding special tokens to shorter training examples to match the length of the longest sequence in each batch. While this ensures uniformity for batch processing, it introduces inefficiencies by including irrelevant padding tokens in the computation and wastes GPU resources. The Hugging Face SFT trainer has always offered the option to use packing to combine multiple training examples, allowing for maximal utilization of GPU resources. However, up till now, it did not offer proper masking of each packed training example. This capability has now been added to Hugging Face Transformers 4.43. We analyse this new feature and show the benefits across different variations of packing.

Submitted 29 July, 2024; v1 submitted 12 July, 2024; originally announced July 2024.
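The contrast between padding and packing described in the abstract above can be illustrated with a small, library-free sketch. It is not the Hugging Face implementation: the helper names `padded_token_count` and `pack` are hypothetical, and restarting position ids at each example boundary (together with cumulative sequence lengths) is shown only as one common way to tell a variable-length attention kernel where packed examples begin and end.

```python
# Minimal sketch of padding vs. packing for a batch of tokenized examples.
# ASSUMPTIONS: helper names are hypothetical; per-example position ids and
# cumulative lengths are one common way to mark example boundaries.
from typing import List, Tuple


def padded_token_count(lengths: List[int]) -> int:
    """Tokens processed when every example is padded to the longest one."""
    return max(lengths) * len(lengths)


def pack(examples: List[List[int]]) -> Tuple[List[int], List[int], List[int]]:
    """Concatenate examples into one sequence without padding.

    Returns the flat token ids, position ids that restart at 0 for each
    example, and cumulative sequence lengths marking the boundaries.
    """
    input_ids: List[int] = []
    position_ids: List[int] = []
    cu_seqlens: List[int] = [0]
    for example in examples:
        input_ids.extend(example)
        position_ids.extend(range(len(example)))
        cu_seqlens.append(cu_seqlens[-1] + len(example))
    return input_ids, position_ids, cu_seqlens


examples = [[11, 12, 13], [21], [31, 32]]  # toy token ids
input_ids, position_ids, cu_seqlens = pack(examples)

print("padded tokens:", padded_token_count([len(e) for e in examples]))
print("packed tokens:", len(input_ids))
print("position_ids: ", position_ids)
print("cu_seqlens:   ", cu_seqlens)
```

In this toy batch, padding processes nine token slots while packing processes six; the boundary markers are what allow attention to be masked so that tokens from one example do not attend to another.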

arXiv:2407.08586 [pdf, other]

Centrality dependence of Lévy-stable two-pion Bose-Einstein correlations in $\sqrt{s_{_{NN}}}=200$ GeV Au$+$Au collisions

Authors: PHENIX Collaboration, N. J. Abdulameer, U. Acharya, A. Adare, C. Aidala, N. N. Ajitanand, Y. Akiba, R. Akimoto, H. Al-Ta'ani, J. Alexander, A. Angerami, K. Aoki, N. Apadula, Y. Aramaki, H. Asano, E. C. Aschenauer, E. T. Atomssa, T. C. Awes, B. Azmoun, V. Babintsev, M. Bai, B. Bannier, K. N. Barish, B. Bassalleck, S. Bathe, et al. (377 additional authors not shown)

Abstract: The PHENIX experiment measured the centrality dependence of two-pion Bose-Einstein correlation functions in $\sqrt{s_{_{NN}}}=200$ GeV Au$+$Au collisions at the Relativistic Heavy Ion Collider at Brookhaven National Laboratory. The data are well represented by Lévy-stable source distributions. The extracted source parameters are the correlation-strength parameter $λ$, the Lévy index of stability $α$, and the Lévy-scale parameter $R$ as a function of transverse mass $m_T$ and centrality. The $λ(m_T)$ parameter is constant at larger values of $m_T$, but decreases as $m_T$ decreases. The Lévy scale parameter $R(m_T)$ decreases with $m_T$ and exhibits proportionality to the length scale of the nuclear overlap region. The Lévy exponent $α(m_T)$ is independent of $m_T$ within uncertainties in each investigated centrality bin, but shows a clear centrality dependence. At all centralities, the Lévy exponent $α$ is significantly different from that of Gaussian ($α=2$) or Cauchy ($α=1$) source distributions. Comparisons to the predictions of Monte-Carlo simulations of resonance-decay chains show that in all but the most peripheral centrality class (50%-60%), the obtained results are inconsistent with the measurements, unless a significant reduction of the in-medium mass of the $η'$ meson is included. In each centrality class, the best value of the in-medium $η'$ mass is compared to the mass of the $η$ meson, as well as to several theoretical predictions that consider restoration of $U_A(1)$ symmetry in hot hadronic matter.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 401 authors from 75 institutions, 20 pages, 15 figures, 2 tables. v1 is version submitted to Physical Review C. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html
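The roles of the three source parameters named in the abstract above can be illustrated with a short sketch. The abstract does not state a functional form; the code assumes the commonly used Lévy parametrization $C_2(q) = 1 + λ\,\exp[-(qR)^α]$ with Coulomb and other final-state corrections ignored, and the parameter values are arbitrary illustrations rather than measured results.

```python
# Minimal sketch of a Lévy-type two-particle correlation function.
# ASSUMPTIONS: the form C2(q) = 1 + lam * exp(-(q*R)^alpha), neglect of
# final-state interactions, and all numeric parameter values below.
import math

HBARC_GEV_FM = 0.1973269804  # hbar*c in GeV*fm, converts q*R to a pure number


def levy_c2(q_gev: float, lam: float, radius_fm: float, alpha: float) -> float:
    """Correlation function value at relative momentum q (GeV/c)."""
    x = q_gev * radius_fm / HBARC_GEV_FM
    return 1.0 + lam * math.exp(-(x ** alpha))


# Compare a Gaussian (alpha = 2), a Cauchy (alpha = 1), and an intermediate
# index at the same illustrative strength and scale.
for alpha in (2.0, 1.0, 1.2):
    values = [levy_c2(q, lam=0.8, radius_fm=5.0, alpha=alpha) for q in (0.0, 0.02, 0.05)]
    print(f"alpha = {alpha}: " + ", ".join(f"{v:.3f}" for v in values))
```

At $q = 0$ the function equals $1 + λ$ for any $α$, which is why $λ$ is read as the correlation strength, while $α$ controls how quickly the correlation falls off toward 1.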

arXiv:2407.06362 [pdf, other]

doi 10.1039/D4MH00584H

Self-deployable contracting-cord metamaterials with tunable mechanical properties

Authors: Wenzhong Yan, Talmage Jones, Christopher L. Jawetz, Ryan H. Lee, Jonathan B. Hopkins, Ankur Mehta

Abstract: Recent advances in active materials and fabrication techniques have enabled the production of cyclically self-deployable metamaterials with an expanded functionality space. However, designing metamaterials that possess continuously tunable mechanical properties after self-deployment remains a challenge, notwithstanding its importance. Inspired by push puppets, we introduce an efficient design strategy to create reversibly self-deployable metamaterials with continuously tunable post-deployment stiffness and damping. Our metamaterial comprises contracting actuators threaded through beads with matching conical concavo-convex interfaces in networked chains. The slack network conforms to arbitrary shapes, but when actuated, it self-assembles into a preprogrammed configuration with beads gathered together. Further contraction of the actuators can dynamically tune the assembly's mechanical properties through the beads' particle jamming, while maintaining the overall structure with minimal change. We show that, after deployment, such metamaterials exhibit pronounced tunability in bending-dominated configurations: they can become more than 35 times stiffer and change their damping capability by over 50%. Through systematic analysis, we find that the beads' conical angle can introduce geometric nonlinearity, which has a major effect on the self-deployability and tunability of the metamaterial. Our work provides routes towards reversibly self-deployable, lightweight, and tunable metamaterials, with potential applications in soft robotics, reconfigurable architectures, and space engineering.

Submitted 8 July, 2024; originally announced July 2024.

Comments: 6 figures

Journal ref: Materials Horizons (2024)

arXiv:2406.17294 [pdf, other]

Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models

Authors: Wenhao Shi, Zhiqiang Hu, Yi Bin, Junhua Liu, Yang Yang, See-Kiong Ng, Lidong Bing, Roy Ka-Wei Lee

Abstract: Large language models (LLMs) have demonstrated impressive reasoning capabilities, particularly in textual mathematical problem-solving. However, existing open-source image instruction fine-tuning datasets, containing limited question-answer pairs per image, do not fully exploit visual information to enhance the multimodal mathematical reasoning capabilities of Multimodal LLMs (MLLMs). To bridge this gap, we address the lack of high-quality, diverse multimodal mathematical datasets by collecting 40K high-quality images with question-answer pairs from 24 existing datasets and synthesizing 320K new pairs, creating the MathV360K dataset, which enhances both the breadth and depth of multimodal mathematical questions. We introduce Math-LLaVA, a LLaVA-1.5-based model fine-tuned with MathV360K. This novel approach significantly improves the multimodal mathematical reasoning capabilities of LLaVA-1.5, achieving a 19-point increase and comparable performance to GPT-4V on MathVista's minitest split. Furthermore, Math-LLaVA demonstrates enhanced generalizability, showing substantial improvements on the MMMU benchmark. Our research highlights the importance of dataset diversity and synthesis in advancing MLLMs' mathematical reasoning abilities. The code and data are available at: https://github.com/HZQ950419/Math-LLaVA.

Submitted 26 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

Comments: 8 pages

arXiv:2406.12223 [pdf, other]

ToxiCloakCN: Evaluating Robustness of Offensive Language Detection in Chinese with Cloaking Perturbations

Authors: Yunze Xiao, Yujia Hu, Kenny Tsu Wei Choo, Roy Ka-wei Lee

Abstract: Detecting hate speech and offensive language is essential for maintaining a safe and respectful digital environment. This study examines the limitations of state-of-the-art large language models (LLMs) in identifying offensive content within systematically perturbed data, with a focus on Chinese, a language particularly susceptible to such perturbations. We introduce ToxiCloakCN, an enhanced dataset derived from ToxiCN, augmented with homophonic substitutions and emoji transformations, to test the robustness of LLMs against these cloaking perturbations. Our findings reveal that existing models significantly underperform in detecting offensive content when these perturbations are applied. We provide an in-depth analysis of how different types of offensive content are affected by these perturbations and explore the alignment between human and model explanations of offensiveness. Our work highlights the urgent need for more advanced techniques in offensive language detection to combat the evolving tactics used to evade detection mechanisms.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 10 pages, 5 tables, 2 figures

arXiv:2406.06717 [pdf, ps, other]

Analyzing user archetypes in Singapore's Telegram groups on COVID-19 and climate change

Authors: Val Alvern Cueco Ligo, Lan Tianxiang, Ying Zeng, Lam Yin Cheung, Pi Zonooz, Roy Ka-Wei Lee, Koustuv Saha, Edson C. Tandoc Jr., Navin Kumar

Abstract: Social media platforms, particularly Telegram, play a pivotal role in shaping public perceptions and opinions on global and national issues. Unlike traditional news media, Telegram allows for the proliferation of user-generated content with minimal oversight, making it a significant venue for the spread of controversial and misinformative content. During the COVID-19 pandemic, Telegram's popularity surged in Singapore, a country with one of the highest rates of social media use globally. We leverage Singapore-based Telegram data to analyze information flows within groups focused on COVID-19 and climate change. Using k-means clustering, we identified distinct user archetypes, including Skeptic, Engaged Advocate, Observer, and Analyst, each contributing uniquely to the discourse. We developed a model to classify users into these clusters (Precision: Climate change: 0.99; COVID-19: 0.95). By identifying these user archetypes and examining their contributions to information dissemination, we sought to uncover patterns to inform effective strategies for combating misinformation and enhancing public discourse on pressing global issues.

Submitted 10 June, 2024; originally announced June 2024.

arXiv:2406.06474 [pdf, other]

Towards a Personal Health Large Language Model

Authors: Justin Cosentino, Anastasiya Belyaeva, Xin Liu, Nicholas A. Furlotte, Zhun Yang, Chace Lee, Erik Schenck, Yojan Patel, Jian Cui, Logan Douglas Schneider, Robby Bryant, Ryan G. Gomes, Allen Jiang, Roy Lee, Yun Liu, Javier Perez, Jameson K. Rogers, Cathy Speed, Shyam Tailor, Megan Walker, Jeffrey Yu, Tim Althoff, Conor Heneghan, John Hernandez, Mark Malhotra, et al. (9 additional authors not shown)

Abstract: In health, most large language model (LLM) research has focused on clinical tasks. However, mobile and wearable devices, which are rarely integrated into such tasks, provide rich, longitudinal data for personal health monitoring. Here we present Personal Health Large Language Model (PH-LLM), fine-tuned from Gemini for understanding and reasoning over numerical time-series personal health data. We created and curated three datasets that test 1) production of personalized insights and recommendations from sleep patterns, physical activity, and physiological responses, 2) expert domain knowledge, and 3) prediction of self-reported sleep outcomes. For the first task we designed 857 case studies in collaboration with domain experts to assess real-world scenarios in sleep and fitness. Through comprehensive evaluation of domain-specific rubrics, we observed that Gemini Ultra 1.0 and PH-LLM are not statistically different from expert performance in fitness and, while experts remain superior for sleep, fine-tuning PH-LLM provided significant improvements in using relevant domain knowledge and personalizing information for sleep insights. We evaluated PH-LLM domain knowledge using multiple choice sleep medicine and fitness examinations. PH-LLM achieved 79% on sleep and 88% on fitness, exceeding average scores from a sample of human experts. Finally, we trained PH-LLM to predict self-reported sleep quality outcomes from textual and multimodal encoding representations of wearable data, and demonstrate that multimodal encoding is required to match performance of specialized discriminative models. Although further development and evaluation are necessary in the safety-critical personal health domain, these results demonstrate both the broad knowledge and capabilities of Gemini models and the benefit of contextualizing physiological data for personal health applications as done with PH-LLM.

Submitted 10 June, 2024; originally announced June 2024.

Comments: 72 pages

arXiv:2406.02352 [pdf, other]

System-Aware Neural ODE Processes for Few-Shot Bayesian Optimization

Authors: Jixiang Qing, Becky D Langdon, Robert M Lee, Behrang Shafei, Mark van der Wilk, Calvin Tsay, Ruth Misener

Abstract: We consider the problem of optimizing initial conditions and timing in dynamical systems governed by unknown ordinary differential equations (ODEs), where evaluating different initial conditions is costly and there are constraints on observation times. To identify the optimal conditions within several trials, we introduce a few-shot Bayesian Optimization (BO) framework based on the system's prior information. At the core of our approach is the System-Aware Neural ODE Processes (SANODEP), an extension of Neural ODE Processes (NODEP) designed to meta-learn ODE systems from multiple trajectories using a novel context embedding block. Additionally, we propose a multi-scenario loss function specifically for optimization purposes. Our two-stage BO framework effectively incorporates search space constraints, enabling efficient optimization of both initial conditions and observation timings. We conduct extensive experiments showcasing SANODEP's potential for few-shot BO. We also explore SANODEP's adaptability to varying levels of prior information, highlighting the trade-off between prior flexibility and model fitting accuracy.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.00549 [pdf, other]

Zero Inflation as a Missing Data Problem: a Proxy-based Approach

Authors: Trung Phung, Jaron J. R. Lee, Opeyemi Oladapo-Shittu, Eili Y. Klein, Ayse Pinar Gurses, Susan M. Hannum, Kimberly Weems, Jill A. Marsteller, Sara E. Cosgrove, Sara C. Keller, Ilya Shpitser

Abstract: A common type of zero-inflated data has certain true values incorrectly replaced by zeros due to data recording conventions (rare outcomes assumed to be absent) or details of data recording equipment (e.g. artificial zeros in gene expression data). Existing methods for zero-inflated data either fit the observed data likelihood via parametric mixture models that explicitly represent excess zeros, or aim to replace excess zeros by imputed values. If the goal of the analysis relies on knowing true data realizations, a particular challenge with zero-inflated data is identifiability, since it is difficult to correctly determine which observed zeros are real and which are inflated. This paper views zero-inflated data as a general type of missing data problem, where the observability indicator for a potentially censored variable is itself unobserved whenever a zero is recorded. We show that, without additional assumptions, target parameters involving a zero-inflated variable are not identified. However, if a proxy of the missingness indicator is observed, a modification of the effect restoration approach of Kuroki and Pearl allows identification and estimation, given the proxy-indicator relationship is known. If this relationship is unknown, our approach yields a partial identification strategy for sensitivity analysis. Specifically, we show that only certain proxy-indicator relationships are compatible with the observed data distribution. We give an analytic bound for this relationship in cases with a categorical outcome, which is sharp in certain models. For more complex cases, sharp numerical bounds may be computed using methods in Duarte et al. [2023]. We illustrate our method via simulation studies and a data application on central line-associated bloodstream infections (CLABSIs).

Submitted 2 July, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

Comments: 28 pages, 8 figures, accepted for the 40th Conference on Uncertainty in Artificial Intelligence (UAI 2024)

arXiv:2405.14791 [pdf, other]

Recurrent Early Exits for Federated Learning with Heterogeneous Clients

Authors: Royson Lee, Javier Fernandez-Marques, Shell Xu Hu, Da Li, Stefanos Laskaridis, Łukasz Dudziak, Timothy Hospedales, Ferenc Huszár, Nicholas D. Lane

Abstract: Federated learning (FL) has enabled distributed learning of a model across multiple clients in a privacy-preserving manner. One of the main challenges of FL is to accommodate clients with varying hardware capacities; clients have differing compute and memory requirements. To tackle this challenge, recent state-of-the-art approaches leverage the use of early exits. Nonetheless, these approaches fall short of mitigating the challenges of jointly learning multiple exit classifiers, often relying on hand-picked heuristic solutions for knowledge distillation among classifiers and/or utilizing additional layers for weaker classifiers. In this work, instead of utilizing multiple classifiers, we propose a recurrent early exit approach named ReeFL that fuses features from different sub-models into a single shared classifier. Specifically, we use a transformer-based early-exit module shared among sub-models to i) better exploit multi-layer feature representations for task-specific prediction and ii) modulate the feature representation of the backbone model for subsequent predictions. We additionally present a per-client self-distillation approach where the best sub-model is automatically selected as the teacher of the other sub-models at each client. Our experiments on standard image and speech classification benchmarks across various emerging federated fine-tuning baselines demonstrate ReeFL's effectiveness over previous works.

Submitted 27 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

Comments: Accepted at the 41st International Conference on Machine Learning (ICML 2024)

arXiv:2405.12594 [pdf, other]

Statistical Qubit Freezing Extending Physical Limit of Quantum Annealers

Authors: Jeung Rac Lee, June-Koo Kevin Rhee, Changjun Kim, Bo Hyun Choi

Abstract: Adiabatic quantum annealers encounter scalability challenges due to exponentially fast diminishing energy gaps between ground and excited states with qubit-count increase. This introduces errors in identifying ground states, compounded by thermal noise. We propose a novel algorithmic scheme called statistical qubit freezing (SQF) that selectively fixes the state of statistically deterministic qubits in the annealing Hamiltonian model of the given problem. Applying freezing repeatedly, SQF significantly enhances the spectral gap of an adiabatic process, for example by up to 60\% compared to traditional annealing methods in the standard D-Wave's quantum Ising machine solution, effectively overcoming the fundamental limitations.

Submitted 27 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

Comments: 11 pages, 6 figures

arXiv:2405.12007 [pdf]

The Brightness of Starlink Mini Satellites During Orbit-Raising

Authors: Anthony Mallama, Richard E. Cole, Jay Respler, Scott Harrington, Ron Lee, Aaron Worley

Abstract: Observations of Starlink V2 Mini satellites during orbit-raising suggest that SpaceX applies brightness mitigation when they reach a height of 357 km. The mean apparent magnitude for objects below that height threshold is 2.68 while the mean for those above is 6.46. When magnitudes are adjusted to a uniform distance of 1000 km the means are 4.58 and 7.52, respectively. The difference of 2.94 between distance-adjusted magnitudes above and below threshold implies that mitigation is 93% effective in reducing the brightness of orbit-raising spacecraft. Orbit-raising Mini spacecraft have a smaller impact on astronomical observations than higher altitude on-station spacecraft because they are relatively few in number. They also spend less time traversing the sky and spend longer in the Earth's shadow. These low-altitude objects will be more out-of-focus in large telescopes such as the LSST which reduces their impact, too. However, they attract considerable public attention and airline pilots have reported them as Unidentified Aerial Phenomena.

Submitted 20 May, 2024; originally announced May 2024.
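
Note: the 93% figure quoted in the abstract above can be checked with the standard magnitude-to-flux relation (Pogson's relation), under which a magnitude difference of Δm corresponds to a flux ratio of 10^(-0.4 Δm). The sketch below is illustrative only: the function names are hypothetical and not taken from the paper, and the only inputs are the two distance-adjusted mean magnitudes stated in the abstract.

```python
def flux_ratio(delta_mag: float) -> float:
    """Flux of the fainter object relative to the brighter one,
    for a magnitude difference delta_mag (Pogson's relation)."""
    return 10 ** (-0.4 * delta_mag)


def brightness_reduction(mag_unmitigated: float, mag_mitigated: float) -> float:
    """Fractional drop in brightness when the magnitude rises
    from mag_unmitigated to mag_mitigated (larger magnitude = fainter)."""
    return 1.0 - flux_ratio(mag_mitigated - mag_unmitigated)


# Distance-adjusted mean magnitudes quoted in the abstract (1000 km standard distance).
mean_below_threshold = 4.58  # below 357 km, before mitigation
mean_above_threshold = 7.52  # above 357 km, after mitigation

reduction = brightness_reduction(mean_below_threshold, mean_above_threshold)
print(f"Brightness reduction: {reduction:.0%}")
```

The computed reduction rounds to the 93% reported by the authors, since a 2.94-magnitude difference corresponds to roughly a factor of 15 in flux.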

arXiv:2405.10221 [pdf, other]

Scalarisation-based risk concepts for robust multi-objective optimisation

Authors: Ben Tu, Nikolas Kantas, Robert M. Lee, Behrang Shafei

Abstract: Robust optimisation is a well-established framework for optimising functions in the presence of uncertainty. The inherent goal of this problem is to identify a collection of inputs whose outputs are both desirable for the decision maker, whilst also being robust to the underlying uncertainties in the problem. In this work, we study the multi-objective case of this problem. We identify that the majority of all robust multi-objective algorithms rely on two key operations: robustification and scalarisation. Robustification refers to the strategy that is used to account for the uncertainty in the problem. Scalarisation refers to the procedure that is used to encode the relative importance of each objective to a scalar-valued reward. As these operations are not necessarily commutative, the order that they are performed in has an impact on the resulting solutions that are identified and the final decisions that are made. The purpose of this work is to give a thorough exposition on the effects of these different orderings and in particular highlight when one should opt for one ordering over the other. As part of our analysis, we showcase how many existing risk concepts can be integrated into the specification and solution of a robust multi-objective optimisation problem. Besides this, we also demonstrate how one can principally define the notion of a robust Pareto front and a robust performance metric based on our ``robustify and scalarise'' methodology.
To illustrate the efficacy of these new ideas, we present two insightful case studies which are based on real-world data sets.

Submitted 15 July, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

Comments: The code is available at: https://github.com/benmltu/scalarize

arXiv:2405.08410 [pdf, other]

Classification of closed conformally flat Lorentzian manifolds with unipotent holonomy

Authors: Rachel Lee, Karin Melnick

Abstract: We classify closed, conformally flat Lorentzian manifolds of dimension $n \geq 3$ with unipotent holonomy in PO(2,n). They are all Kleinian and fall into four different geometric types according to the intersection of the image of the developing map with a holonomy-invariant isotropic flag. They are homeomorphic to $S^{n-1} \times S^1$ or a nilmanifold of degree at most three, up to a finite cover. We classify those admitting an essential conformal flow; these fall into two geometric types, both homeomorphic to $S^{n-1} \times S^1$ up to finite cover.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 34 pages, 3 figures

MSC Class: 53C50; 57N16

arXiv:2405.03679 [pdf, other]

A topological model for the HOMFLY-PT polynomial

Authors: Cristina Ana-Maria Anghel, Christine Ruey Shan Lee

Abstract: We give the first known topological model for the HOMFLY-PT polynomial. More precisely, we prove that this invariant is given by a set of graded intersections between explicit Lagrangian submanifolds in a fixed configuration space on a Heegaard surface for the link exterior. The submanifolds are supported on arcs and ovals on the surface. The construction also leads to a topological model for the Jones polynomial constructed from Heegaard surfaces associated directly to the link diagram. In particular, it does not rely on a choice of a braid representative for the link. This opens up new avenues for investigation of the geometry of these invariants, as well as categorifications of geometric nature.

Submitted 6 May, 2024; originally announced May 2024.

Comments: 47 pages, comments welcome

arXiv:2405.02812 [pdf, other]

Neural Network Enhanced Single-Photon Fock State Tomography

Authors: Hsien-Yi Hsieh, Yi-Ru Chen, Jingyu Ning, Hsun-Chung Wu, Hua Li Chen, Zi-Hao Shi, Po-Han Wang, Ole Steuernagel, Chien-Ming Wu, Ray-Kuang Lee

Abstract: Even though heralded single-photon sources have been generated routinely through the spontaneous parametric down conversion, vacuum and multiple photon states are unavoidably involved. With machine-learning, we report the experimental implementation of single-photon quantum state tomography by directly estimating target parameters. Compared to the Hanbury Brown and Twiss (HBT) measurements only with clicked events recorded, our neural network enhanced quantum state tomography characterizes the photon number distribution for all possible photon number states from the balanced homodyne detectors. By using the histogram-based architecture, a direct parameter estimation on the negativity in Wigner's quasi-probability phase space is demonstrated. Such a fast, robust, and precise quantum state tomography provides us a crucial diagnostic toolbox for the applications with single-photon Fock states and other non-Gaussian quantum states.

Submitted 5 May, 2024; originally announced May 2024.

Comments: 8 pages, 8 figures

arXiv:2405.01842 [pdf, ps, other]

SGHateCheck: Functional Tests for Detecting Hate Speech in Low-Resource Languages of Singapore

Authors: Ri Chi Ng, Nirmalendu Prakash, Ming Shan Hee, Kenny Tsu Wei Choo, Roy Ka-Wei Lee

Abstract: To address the limitations of current hate speech detection models, we introduce \textsf{SGHateCheck}, a novel framework designed for the linguistic and cultural context of Singapore and Southeast Asia. It extends the functional testing approach of HateCheck and MHC, employing large language models for translation and paraphrasing into Singapore's main languages, and refining these with native annotators. \textsf{SGHateCheck} reveals critical flaws in state-of-the-art models, highlighting their inadequacy in sensitive content moderation. This work aims to foster the development of more effective hate speech detection tools for diverse linguistic environments, particularly for Singapore and Southeast Asia contexts.

Submitted 3 May, 2024; originally announced May 2024.

arXiv:2405.01404 [pdf, other]

Random Pareto front surfaces

Authors: Ben Tu, Nikolas Kantas, Robert M. Lee, Behrang Shafei

Abstract: The goal of multi-objective optimisation is to identify the Pareto front surface which is the set obtained by connecting the best trade-off points. Typically this surface is computed by evaluating the objectives at different points and then interpolating between the subset of the best evaluated trade-off points. In this work, we propose to parameterise the Pareto front surface using polar coordinates. More precisely, we show that any Pareto front surface can be equivalently represented using a scalar-valued length function which returns the projected length along any positive radial direction. We then use this representation in order to rigorously develop the theory and applications of stochastic Pareto front surfaces. In particular, we derive many Pareto front surface statistics of interest such as the expectation, covariance and quantiles. We then discuss how these can be used in practice within a design of experiments setting, where the goal is to both infer and use the Pareto front surface distribution in order to make effective decisions. Our framework allows for clear uncertainty quantification and we also develop advanced visualisation techniques for this purpose. Finally we discuss the applicability of our ideas within multivariate extreme value theory and illustrate our methodology in a variety of numerical examples, including a case study with a real-world air pollution data set.

Submitted 21 June, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

Comments: The code is available at: https://github.com/benmltu/scalarize

arXiv:2404.17667 [pdf, other]

SiamQuality: A ConvNet-Based Foundation Model for Imperfect Physiological Signals

Authors: Cheng Ding, Zhicheng Guo, Zhaoliang Chen, Randall J Lee, Cynthia Rudin, Xiao Hu

Abstract: Foundation models, especially those using transformers as backbones, have gained significant popularity, particularly in language and language-vision tasks. However, large foundation models are typically trained on high-quality data, which poses a significant challenge, given the prevalence of poor-quality real-world data. This challenge is more pronounced for developing foundation models for physiological data; such data are often noisy, incomplete, or inconsistent. The present work aims to provide a toolset for developing foundation models on physiological data. We leverage a large dataset of photoplethysmography (PPG) signals from hospitalized intensive care patients. For this data, we propose SimQuality, a novel self-supervised learning task based on convolutional neural networks (CNNs) as the backbone to enforce representations to be similar for good and poor quality signals that are from similar physiological states. We pre-trained the SimQuality on over 36 million 30-second PPG pairs and then fine-tuned and tested on six downstream tasks using external datasets. The results demonstrate the superiority of the proposed approach on all the downstream tasks, which are extremely important for heart monitoring on wearable devices. Our method indicates that CNNs can be an effective backbone for foundation models that are robust to training data quality.

Submitted 26 April, 2024; originally announced April 2024.

arXiv:2404.15353 [pdf, other]

SQUWA: Signal Quality Aware DNN Architecture for Enhanced Accuracy in Atrial Fibrillation Detection from Noisy PPG Signals

Authors: Runze Yan, Cheng Ding, Ran Xiao, Aleksandr Fedorov, Randall J Lee, Fadi Nahab, Xiao Hu

Abstract: Atrial fibrillation (AF), a common cardiac arrhythmia, significantly increases the risk of stroke, heart disease, and mortality. Photoplethysmography (PPG) offers a promising solution for continuous AF monitoring, due to its cost efficiency and integration into wearable devices. Nonetheless, PPG signals are susceptible to corruption from motion artifacts and other factors often encountered in ambulatory settings. Conventional approaches typically discard corrupted segments or attempt to reconstruct original signals, allowing for the use of standard machine learning techniques. However, this reduces dataset size and introduces biases, compromising prediction accuracy and the effectiveness of continuous monitoring. We propose a novel deep learning model, Signal Quality Weighted Fusion of Attentional Convolution and Recurrent Neural Network (SQUWA), designed to learn how to retain accurate predictions from partially corrupted PPG. Specifically, SQUWA innovatively integrates an attention mechanism that directly considers signal quality during the learning process, dynamically adjusting the weights of time series segments based on their quality. This approach enhances the influence of higher-quality segments while reducing that of lower-quality ones, effectively utilizing partially corrupted segments.
This approach represents a departure from the conventional methods that exclude such segments, enabling the utilization of a broader range of data, which has great implications for less disruption when monitoring AF risks and more accurate estimation of AF burdens. Our extensive experiments show that SQUWA outperforms existing PPG-based models, achieving the highest AUCPR of 0.89 with label noise mitigation. This also exceeds the 0.86 AUCPR of models trained using both electrocardiogram (ECG) and PPG data.

Submitted 14 April, 2024; originally announced April 2024.

Comments: 15 pages; 9 figures; 2024 Conference on Health, Inference, and Learning (CHIL)

arXiv:2404.14219 [pdf, other]

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Authors: Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Qin Cai, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Yen-Chun Chen, Yi-Ling Chen, Parul Chopra , et al. (90 additional authors not shown)

Abstract: We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset for training, a scaled-up version of the one used for phi-2, composed of heavily filtered publicly available web data and synthetic data. The model is also further aligned for robustness, safety, and chat format. We also provide some initial parameter-scaling results with 7B and 14B models trained for 4.8T tokens, called phi-3-small and phi-3-medium, both significantly more capable than phi-3-mini (e.g., respectively 75% and 78% on MMLU, and 8.7 and 8.9 on MT-bench). Moreover, we also introduce phi-3-vision, a 4.2 billion parameter model based on phi-3-mini with strong reasoning capabilities for image and text prompts.

Submitted 23 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

Comments: 19 pages

arXiv:2404.09959 [pdf, other]

NNLO QCD corrections to polarized semi-inclusive DIS

Authors: Saurav Goyal, Roman N. Lee, Sven-Olaf Moch, Vaibhav Pathak, Narayan Rana, V. Ravindran

Abstract: Polarized semi-inclusive deep-inelastic scattering (SIDIS) is a key process in the quest for a resolution of the proton spin puzzle. We present the complete results for the polarized SIDIS process at next-to-next-to-leading order (NNLO) in perturbative quantum chromodynamics. Our analytical results include all partonic channels for the scattering of polarized leptons off hadrons and a spin-averaged hadron identified in the final state. A numerical analysis of the NNLO corrections illustrates their significance and the reduced residual scale dependence in the kinematic range probed by the future Electron-Ion-Collider EIC.

Submitted 15 April, 2024; originally announced April 2024.

Comments: 6 pages, 2 figures; 1 ancillary file

arXiv:2404.09904 [pdf, other]

Electrical control of valley polarized charged biexcitons in monolayer WS$_2$

Authors: Sarthak Das, Ding Huang, Ivan Verzhbitskiy, Zi-En Ooi, Chit Siong Lau, Rainer Lee, Calvin Pei Yu Wong, Kuan Eng Johnson Goh

Abstract: Excitons are key to the optoelectronic applications of van der Waals semiconductors with the potential for versatile on-demand tuning of properties. Yet, their electrical manipulation is complicated by their inherent charge neutrality and the additional loss channels induced by electrical doping. We demonstrate the dynamic control of valley polarization in charged biexciton (quinton) states of monolayer tungsten disulfide, achieving up to a sixfold increase in the degree of circular polarization under off-resonant excitation. In contrast to the weak direct tuning of excitons typically observed using electrical gating, the quinton photoluminescence remains stable, even with increased scattering from electron doping. By exciting at the exciton resonances, we observed the reproducible non-monotonic switching of the charged state population as the electron doping is varied under gate bias, indicating a coherent interplay between neutral and charged exciton states.

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.06602 [pdf, ps, other]

A General Identification Algorithm For Data Fusion Problems Under Systematic Selection

Authors: Jaron J. R. Lee, AmirEmad Ghassami, Ilya Shpitser

Abstract: Causal inference is made challenging by confounding, selection bias, and other complications. A common approach to addressing these difficulties is the inclusion of auxiliary data on the superpopulation of interest. Such data may measure a different set of variables, or be obtained under different experimental conditions than the primary dataset. Analysis based on multiple datasets must carefully account for similarities between datasets, while appropriately accounting for differences. In addition, selection of experimental units into different datasets may be systematic; similar difficulties are encountered in missing data problems. Existing methods for combining datasets either do not consider this issue, or assume simple selection mechanisms. In this paper, we provide a general approach, based on graphical causal models, for causal inference from data on the same superpopulation that is obtained under different experimental conditions. Our framework allows both arbitrary unobserved confounding, and arbitrary selection processes into different experimental regimes in our data. We describe how systematic selection processes may be organized into a hierarchy similar to censoring processes in missing data: selected completely at random (SCAR), selected at random (SAR), and selected not at random (SNAR). In addition, we provide a general identification algorithm for interventional distributions in this setting.

Submitted 15 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

Comments: 17 pages

arXiv:2404.04248 [pdf, other]

doi 10.3847/2041-8213/ad5beb

Observation of Gravitational Waves from the Coalescence of a $2.5\text{-}4.5~M_\odot$ Compact Object and a Neutron Star

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, S. Akçay, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah , et al. (1771 additional authors not shown)

Abstract: We report the observation of a coalescing compact binary with component masses $2.5\text{-}4.5~M_\odot$ and $1.2\text{-}2.0~M_\odot$ (all measurements quoted at the 90% credible level). The gravitational-wave signal GW230529_181500 was observed during the fourth observing run of the LIGO-Virgo-KAGRA detector network on 2023 May 29 by the LIGO Livingston Observatory. The primary component of the source has a mass less than $5~M_\odot$ at 99% credibility. We cannot definitively determine from gravitational-wave data alone whether either component of the source is a neutron star or a black hole. However, given existing estimates of the maximum neutron star mass, we find the most probable interpretation of the source to be the coalescence of a neutron star with a black hole that has a mass between the most massive neutron stars and the least massive black holes observed in the Galaxy. We provisionally estimate a merger rate density of $55^{+127}_{-47}~\text{Gpc}^{-3}\,\text{yr}^{-1}$ for compact binary coalescences with properties similar to the source of GW230529_181500; assuming that the source is a neutron star-black hole merger, GW230529_181500-like sources constitute about 60% of the total merger rate inferred for neutron star-black hole coalescences. The discovery of this system implies an increase in the expected rate of neutron star-black hole mergers with electromagnetic counterparts and provides further evidence for compact objects existing within the purported lower mass gap.

Submitted 26 July, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

Comments: 45 pages (10 pages author list, 13 pages main text, 1 page acknowledgements, 13 pages appendices, 8 pages bibliography), 17 figures, 16 tables. Update to match version published in The Astrophysical Journal Letters. Data products available from https://zenodo.org/records/10845779

Report number: LIGO-P2300352

Journal ref: ApJL 970, L34 (2024)

arXiv:2404.03991 [pdf, other]

Towards Efficient and Accurate CT Segmentation via Edge-Preserving Probabilistic Downsampling

Authors: Shahzad Ali, Yu Rim Lee, Soo Young Park, Won Young Tak, Soon Ki Jung

Abstract: Downsampling images and labels, often necessitated by limited resources or to expedite network training, leads to the loss of small objects and thin boundaries. This undermines the segmentation network's capacity to interpret images accurately and predict detailed labels, resulting in diminished performance compared to processing at original resolutions. This situation exemplifies the trade-off between efficiency and accuracy, with higher downsampling factors further impairing segmentation outcomes. Preserving information during downsampling is especially critical for medical image segmentation tasks. To tackle this challenge, we introduce a novel method named Edge-preserving Probabilistic Downsampling (EPD). It utilizes class uncertainty within a local window to produce soft labels, with the window size dictating the downsampling factor. This enables a network to produce quality predictions at low resolutions. Beyond preserving edge details more effectively than conventional nearest-neighbor downsampling, employing a similar algorithm for images, it surpasses bilinear interpolation in image downsampling, enhancing overall performance. Our method significantly improved Intersection over Union (IoU) to 2.85%, 8.65%, and 11.89% when downsampling data to 1/2, 1/4, and 1/8, respectively, compared to conventional interpolation methods.

Submitted 5 April, 2024; originally announced April 2024.

Comments: 5 pages (4 figures, 1 table); This work has been submitted to the IEEE Signal Processing Letters. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2404.01353 [pdf, other]

Efficiently Distilling LLMs for Edge Applications

Authors: Achintya Kundu, Fabian Lim, Aaron Chew, Laura Wynter, Penny Chong, Rhui Dih Lee

Abstract: Supernet training of LLMs is of great interest in industrial applications as it confers the ability to produce a palette of smaller models at constant cost, regardless of the number of models (of different size / latency) produced. We propose a new method called Multistage Low-rank Fine-tuning of Super-transformers (MLFS) for parameter-efficient supernet training. We show that it is possible to obtain high-quality encoder models that are suitable for commercial edge applications, and that while decoder-only models are resistant to a comparable degree of compression, decoders can be effectively sliced for a significant reduction in training time.

Submitted 1 April, 2024; originally announced April 2024.

Comments: This paper has been accepted for publication in NAACL 2024 (Industry Track)

arXiv:2404.01104 [pdf, other]

SentiCSE: A Sentiment-aware Contrastive Sentence Embedding Framework with Sentiment-guided Textual Similarity

Authors: Jaemin Kim, Yohan Na, Kangmin Kim, Sang Rak Lee, Dong-Kyu Chae

Abstract: Recently, sentiment-aware pre-trained language models (PLMs) have demonstrated impressive results in downstream sentiment analysis tasks. However, they neglect to evaluate the quality of their constructed sentiment representations; they just focus on improving the fine-tuning performance, which overshadows the representation quality. We argue that without guaranteeing the representation quality, their downstream performance can be highly dependent on the supervision of the fine-tuning data rather than representation quality. This problem would make it difficult for them to foray into other sentiment-related domains, especially where labeled data is scarce. We first propose Sentiment-guided Textual Similarity (SgTS), a novel metric for evaluating the quality of sentiment representations, which is designed based on the degree of equivalence in sentiment polarity between two sentences. We then propose SentiCSE, a novel Sentiment-aware Contrastive Sentence Embedding framework for constructing sentiment representations via combined word-level and sentence-level objectives, whose quality is guaranteed by SgTS. Qualitative and quantitative comparison with the previous sentiment-aware PLMs shows the superiority of our work. Our code is available at: https://github.com/nayohan/SentiCSE

Submitted 1 April, 2024; originally announced April 2024.

Comments: 14 pages, 8 figures

MSC Class: 68T50 ACM Class: I.2.7

Journal ref: LREC-COLING2024

arXiv:2403.19214 [pdf]

Convolutional network learning of self-consistent electron density via grid-projected atomic fingerprints

Authors: Ryong-Gyu Lee, Yong-Hoon Kim

Abstract: The self-consistent field (SCF) generation of the three-dimensional (3D) electron density distribution ($ρ$) represents a fundamental aspect of density functional theory (DFT) and related first-principles calculations, and how one can shorten or bypass the SCF loop represents a critical question from both practical and fundamental standpoints. Herein, a machine learning strategy DeepSCF is presented in which the map between the SCF $ρ$ and the initial guess density ($ρ_0$) constructed by the summation of neutral atomic densities is learned using 3D convolutional neural networks (CNNs). High accuracy and transferability of DeepSCF are achieved by expanding the input features to include atomic fingerprints beyond $ρ_0$ and encoding them on a 3D grid. The prediction of the residual density ($δρ$) rather than $ρ$ itself is targeted, and, since $δρ$ corresponds to chemical bonding information, a dataset of small-sized organic molecules featuring diverse bonding characters is adopted. After enhancing the fidelity of the method by subjecting the atomic geometries in the dataset to random strains and rotations, the effectiveness of DeepSCF is finally demonstrated using a complex large carbon nanotube-based DNA sequencer model. This work evidences that the nearsightedness in electronic structures can be optimally represented via the local connectivity in CNNs.

Submitted 28 March, 2024; originally announced March 2024.

Comments: 9 pages, 6 figures

arXiv:2403.14652 [pdf, other]

doi 10.1145/3589334.3648151

MemeCraft: Contextual and Stance-Driven Multimodal Meme Generation

Authors: Han Wang, Roy Ka-Wei Lee

Abstract: Online memes have emerged as powerful digital cultural artifacts in the age of social media, offering not only humor but also platforms for political discourse, social critique, and information dissemination. Their extensive reach and influence in shaping online communities' sentiments make them invaluable tools for campaigning and promoting ideologies. Despite the development of several meme-generation tools, there remains a gap in their systematic evaluation and their ability to effectively communicate ideologies. Addressing this, we introduce MemeCraft, an innovative meme generator that leverages large language models (LLMs) and visual language models (VLMs) to produce memes advocating specific social movements. MemeCraft presents an end-to-end pipeline, transforming user prompts into compelling multimodal memes without manual intervention. Conscious of the misuse potential in creating divisive content, an intrinsic safety mechanism is embedded to curb hateful meme production.

Submitted 24 February, 2024; originally announced March 2024.

Comments: 8 pages, 7 figures, ACM MM 2024

ACM Class: I.2.7; I.2.10

arXiv:2403.12249 [pdf, ps, other]

Asymptotic spreading of predator-prey populations in a shifting environment

Authors: King-Yeung Lam, Ray Lee

Abstract: Inspired by recent studies associating shifting temperature conditions with changes in the efficiency of predator species in converting their prey to offspring, we propose a predator-prey model of reaction-diffusion type to analyze the consequence of such effects on the population dynamics and spread of species. In the model, the predator conversion efficiency is represented by a spatially heterogeneous function depending on the variable $ξ=x-c_1t$ for some given $c_1>0$. Using the Hamilton-Jacobi approach, we provide explicit formulas for the spreading speed of the predator species. When the conversion function is monotone increasing, the spreading speed is determined in all cases and non-local pulling is possible. When the function is monotone decreasing, we provide formulas for the spreading speed when the rate of shift of the conversion function is sufficiently fast or slow.

Submitted 16 June, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

MSC Class: 35B40; 35K57; 35R10; 35D40

arXiv:2403.03004 [pdf, other]

Ultralight vector dark matter search using data from the KAGRA O3GK run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, I. Abouelfettouh, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi , et al. (1778 additional authors not shown)

Abstract: Among the various candidates for dark matter (DM), ultralight vector DM can be probed by laser interferometric gravitational wave detectors through the measurement of oscillating length changes in the arm cavities. In this context, KAGRA has a unique feature due to differing compositions of its mirrors, enhancing the signal of vector DM in the length change in the auxiliary channels. Here we present the result of a search for $U(1)_{B-L}$ gauge boson DM using the KAGRA data from auxiliary length channels during the first joint observation run together with GEO600. By applying our search pipeline, which takes into account the stochastic nature of ultralight DM, upper bounds on the coupling strength between the $U(1)_{B-L}$ gauge boson and ordinary matter are obtained for a range of DM masses. While our constraints are less stringent than those derived from previous experiments, this study demonstrates the applicability of our method to the lower-mass vector DM search, which is made difficult in this measurement by the short observation time compared to the auto-correlation time scale of DM.

Submitted 5 March, 2024; originally announced March 2024.

Comments: 20 pages, 5 figures

Report number: LIGO-P2300250

arXiv:2402.17971 [pdf, other]

All in an Aggregated Image for In-Image Learning

Authors: Lei Wang, Wanyu Xu, Zhiqiang Hu, Yihuai Lan, Shan Dong, Hao Wang, Roy Ka-Wei Lee, Ee-Peng Lim

Abstract: This paper introduces a new in-context learning (ICL) mechanism called In-Image Learning (I$^2$L) that combines demonstration examples, visual cues, and chain-of-thought reasoning into an aggregated image to enhance the capabilities of Large Multimodal Models (e.g., GPT-4V) in multimodal reasoning tasks. Unlike previous approaches that rely on converting images to text or incorporating visual input into language models, I$^2$L consolidates all information into an aggregated image and leverages image processing, understanding, and reasoning abilities. This has several advantages: it reduces inaccurate textual descriptions of complex images, provides flexibility in positioning demonstration examples, and avoids multiple input images and lengthy prompts. We also introduce I$^2$L-Hybrid, a method that combines the strengths of I$^2$L with other ICL methods. Specifically, it uses an automatic strategy to select the most suitable method (I$^2$L or another certain ICL method) for a specific task instance. We conduct extensive experiments to assess the effectiveness of I$^2$L and I$^2$L-Hybrid on MathVista, which covers a variety of complex multimodal reasoning tasks. Additionally, we investigate the influence of image resolution, the number of demonstration examples in a single image, and the positions of these demonstrations in the aggregated image on the effectiveness of I$^2$L. Our code is publicly available at https://github.com/AGI-Edgerunners/IIL.

Submitted 2 April, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

Comments: Preprint

arXiv:2402.15103 [pdf]

doi 10.1038/s41524-024-01242-5

Ab initio calculation of the nonequilibrium adsorption energy

Authors: Juho Lee, Hyeonwoo Yeo, Ryong-Gyu Lee, Yong-Hoon Kim

Abstract: While first-principles calculations of electrode-molecule binding play an indispensable role in obtaining atomic-level understanding in surface science and electrochemistry, a significant challenge remains because the adsorption energy is well-defined only in equilibrium. Herein, a theory to calculate the electric enthalpy for electrochemical interfaces is formulated within the multi-space constrained-search density functional theory (MS-DFT), which provides the nonequilibrium total energy of a nanoscale electrode-channel-electrode junction. An additional MS-DFT calculation for the electrode-only counterpart that maintains the same bias voltage allows one to identify the internal energy of the channel as well as the electric field and the channel polarization, which together determine the electric enthalpy and the nonequilibrium adsorption energy. Application of the developed scheme to the water-Au and water-graphene interface models shows that the Au and graphene electrodes induce very different behaviors in terms of the electrode potential-dependent stabilization of water configurations. The theory developed here will be a valuable tool in the ongoing effort to obtain an atomic-scale understanding of bias-dependent molecular reorganizations in electrified interfaces.

Submitted 23 February, 2024; originally announced February 2024.

Comments: 8 pages, 4 figures

Journal ref: npj Comput. Mater. 10, 60 (2024)

arXiv:2402.12647 [pdf, other]

DiffusionNOCS: Managing Symmetry and Uncertainty in Sim2Real Multi-Modal Category-level Pose Estimation

Authors: Takuya Ikeda, Sergey Zakharov, Tianyi Ko, Muhammad Zubair Irshad, Robert Lee, Katherine Liu, Rares Ambrus, Koichi Nishiwaki

Abstract: This paper addresses the challenging problem of category-level pose estimation. Current state-of-the-art methods for this task face challenges when dealing with symmetric objects and when attempting to generalize to new environments solely through synthetic data training. In this work, we address these challenges by proposing a probabilistic model that relies on diffusion to estimate dense canonical maps crucial for recovering partial object shapes as well as establishing correspondences essential for pose estimation. Furthermore, we introduce critical components to enhance performance by leveraging the strength of the diffusion models with multi-modal input representations. We demonstrate the effectiveness of our method by testing it on a range of real datasets. Despite being trained solely on our generated synthetic data, our approach achieves state-of-the-art performance and unprecedented generalization qualities, outperforming baselines, even those specifically trained on the target domain.

Submitted 5 March, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

Comments: 8 pages. 9 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2402.11845 [pdf, other]

Modularized Networks for Few-shot Hateful Meme Detection

Authors: Rui Cao, Roy Ka-Wei Lee, Jing Jiang

Abstract: In this paper, we address the challenge of detecting hateful memes in the low-resource setting where only a few labeled examples are available. Our approach leverages the compositionality of Low-rank adaptation (LoRA), a widely used parameter-efficient tuning technique. We commence by fine-tuning large language models (LLMs) with LoRA on selected tasks pertinent to hateful meme detection, thereby generating a suite of LoRA modules. These modules are capable of essential reasoning skills for hateful meme detection. We then use the few available annotated samples to train a module composer, which assigns weights to the LoRA modules based on their relevance. The model's learnable parameters are directly proportional to the number of LoRA modules. This modularized network, underpinned by LLMs and augmented with LoRA modules, exhibits enhanced generalization in the context of hateful meme detection. Our evaluation spans three datasets designed for hateful meme detection in a few-shot learning context. The proposed method demonstrates superior performance to traditional in-context learning, which is also more computationally intensive during inference.

Submitted 19 February, 2024; originally announced February 2024.

Comments: camera-ready for WWW, 2024, Web4Good
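
The module composer described in the abstract above (one weight per LoRA module, so the learnable parameter count scales with the number of modules) can be sketched roughly as follows. This is an illustrative guess, not the paper's implementation: the softmax normalization, the flat-list representation of each module's update, and the names `softmax` and `compose_updates` are all assumptions.

```python
import math


def softmax(scores):
    """Turn one learnable score per module into weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def compose_updates(module_updates, scores):
    """Weighted sum of per-module updates (each a flat list of floats).

    Assumption: the only trainable values are `scores`, one per module.
    """
    weights = softmax(scores)
    size = len(module_updates[0])
    return [
        sum(w * update[i] for w, update in zip(weights, module_updates))
        for i in range(size)
    ]


# Three hypothetical module updates with equal relevance scores.
updates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.0, 0.0, 0.0]
combined = compose_updates(updates, scores)
```

With equal scores every module contributes one third of its update; in a few-shot setting only `scores` would be fitted, which keeps the number of trained values equal to the number of modules.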

arXiv:2402.08406 [pdf, other]

Transition Constrained Bayesian Optimization via Markov Decision Processes

Authors: Jose Pablo Folch, Calvin Tsay, Robert M Lee, Behrang Shafei, Weronika Ormaniec, Andreas Krause, Mark van der Wilk, Ruth Misener, Mojmír Mutný

Abstract: Bayesian optimization is a methodology to optimize black-box functions. Traditionally, it focuses on the setting where you can arbitrarily query the search space. However, many real-life problems do not offer this flexibility; in particular, the search space of the next query may depend on previous ones. Example challenges arise in the physical sciences in the form of local movement constraints, required monotonicity in certain variables, and transitions influencing the accuracy of measurements. Altogether, such transition constraints necessitate a form of planning. This work extends classical Bayesian optimization via the framework of Markov Decision Processes. We iteratively solve a tractable linearization of our utility function using reinforcement learning to obtain a policy that plans ahead for the entire horizon. This is a parallel to the optimization of an acquisition function in policy space. The resulting policy is potentially history-dependent and non-Markovian. We showcase applications in chemical reactor optimization, informative path planning, machine calibration, and other synthetic examples.

Submitted 29 May, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

Comments: 10 pages main, 32 pages total, 16 figures, 2 tables, preprint

arXiv:2402.01707 [pdf, other]

Revitalizing Sex Education for Chinese Children: A Formative Study

Authors: Kyrie Zhixuan Zhou, Yilin Zhu, Jingwen Shan, Madelyn Rose Sanfilippo, Hee Rin Lee

Abstract: Sex education helps children obtain knowledge and awareness of sexuality, and protects them against sexually transmitted diseases, pregnancy, and sexual abuse. Sex education is not well taught to children in China -- both school-based education and parental communication on this topic are limited. To interrogate the status quo of sex education in China and explore suitable interventions, we conducted a series of formative studies including interviews and social media analysis. Multiple stakeholders such as children, parents, education practitioners, and the general public were engaged for an in-depth understanding of their unique needs regarding teaching and learning sex education. We found that school-based sex education for Chinese children was currently insufficient and restrictive. Involving parents in sex education posed several challenges, such as a lack of sexuality and pedagogy knowledge, and embarrassment in initiating sex education conversations. Culture and politics were major hurdles to effective sex education. Based on the findings, we reflect on the complex interactions between culture, politics, education policy, and pedagogy, and discuss situated design of sex education in broader cultural and social contexts.

Submitted 25 January, 2024; originally announced February 2024.

arXiv:2402.01185 [pdf, other]

Nano-ironing van der Waals Heterostructures Towards Electrically Controlled Quantum Dots

Authors: Teymour Talha-Dean, Yaoju Tarn, Subhrajit Mukherjee, John Wellington John, Ding Huang, Ivan A. Verzhbitskiy, Dasari Venkatakrishnarao, Sarthak Das, Rainer Lee, Abhishek Mishra, Shuhua Wang, Yee Sin Ang, Kuan Eng Johnson Goh, Chit Siong Lau

Abstract: Assembling two-dimensional van der Waals layered materials into heterostructures is an exciting development that sparked the discovery of rich correlated electronic phenomena and offers possibilities for designer device applications. However, resist residue from fabrication processes is a major limitation. Resulting disordered interfaces degrade device performance and mask underlying transport physics. Conventional cleaning processes are inefficient and can cause material and device damage. Here, we show that thermal scanning probe based cleaning can effectively eliminate resist residue to recover pristine material surfaces. Our technique is compatible at both the material- and device-level, and we demonstrate the significant improvement in the electrical performance of 2D WS2 transistors. We also demonstrate the cleaning of van der Waals heterostructures to achieve interfaces with low disorder. This enables the electrical formation and control of quantum dots that can be tuned from macroscopic current flow to the single-electron tunnelling regime. Such material processing advances are crucial for constructing high-quality vdW heterostructures that are important platforms for fundamental studies and building blocks for quantum and nano-electronics applications.

Submitted 2 February, 2024; originally announced February 2024.

Comments: 4 Figures

arXiv:2401.17193 [pdf, other]

Widths of links via diagram colorings

Authors: Ricky Lee, Puttipong Pongtanapaisan, Hanh Vo

Abstract: In this paper, we define invariants of links in terms of colorings of link diagrams and prove that these invariants coincide with various notions of widths of links with respect to the standard Morse function. Our formulations are advantageous because they are algorithmic and suitable for program implementations. As an application, we calculate the max-width of over 10000 links up to 14 crossings from the link table.

Submitted 30 January, 2024; originally announced January 2024.

arXiv:2401.16727 [pdf, other]

Recent Advances in Hate Speech Moderation: Multimodality and the Role of Large Models

Authors: Ming Shan Hee, Shivam Sharma, Rui Cao, Palash Nandi, Tanmoy Chakraborty, Roy Ka-Wei Lee

Abstract: In the evolving landscape of online communication, moderating hate speech (HS) presents an intricate challenge, compounded by the multimodal nature of digital content. This comprehensive survey delves into the recent strides in HS moderation, spotlighting the burgeoning role of large language models (LLMs) and large multimodal models (LMMs). Our exploration begins with a thorough analysis of current literature, revealing the nuanced interplay between textual, visual, and auditory elements in propagating HS. We uncover a notable trend towards integrating these modalities, primarily due to the complexity and subtlety with which HS is disseminated. A significant emphasis is placed on the advances facilitated by LLMs and LMMs, which have begun to redefine the boundaries of detection and moderation capabilities. We identify existing gaps in research, particularly in the context of underrepresented languages and cultures, and the need for solutions to handle low-resource settings. The survey concludes with a forward-looking perspective, outlining potential avenues for future research, including the exploration of novel AI methodologies, the ethical governance of AI in moderation, and the development of more nuanced, context-aware systems. This comprehensive overview aims to catalyze further research and foster a collaborative effort towards more sophisticated, responsible, and human-centric approaches to HS moderation in the digital era. WARNING: This paper contains offensive examples.

Submitted 1 February, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: Preprint; Under-Review

arXiv:2401.08007 [pdf, ps, other]

Strongly Dense Representations of Hyperbolic 3-Manifold Groups

Authors: Ricky Lee

Abstract: We provide the first examples of strongly dense representations of a hyperbolic 3-manifold group into $SL(4,\mathbb{R})$ and $SU(3,1)$, i.e., representations where every pair of non-commuting elements has Zariski dense image. Our examples are holonomy representations arising from projective deformations of its hyperbolic structure. As a corollary, we get that $SL(4,\mathbb{R})$ has non-Hitchin strongly dense surface subgroups.

Submitted 15 January, 2024; originally announced January 2024.

Comments: 11 pages

arXiv:2401.07856 [pdf]

doi 10.1126/sciadv.adn9420

Information hiding cameras: optical concealment of object information into ordinary images

Authors: Bijie Bai, Ryan Lee, Yuhang Li, Tianyi Gan, Yuntian Wang, Mona Jarrahi, Aydogan Ozcan

Abstract: Data protection methods like cryptography, despite being effective, inadvertently signal the presence of secret communication, thereby drawing undue attention. Here, we introduce an optical information hiding camera integrated with an electronic decoder, optimized jointly through deep learning. This information hiding-decoding system employs a diffractive optical processor as its front-end, which transforms and hides input images in the form of ordinary-looking patterns that deceive/mislead human observers. This information hiding transformation is valid for infinitely many combinations of secret messages, all of which are transformed into ordinary-looking output patterns, achieved all-optically through passive light-matter interactions within the optical processor. By processing these ordinary-looking output images, a jointly-trained electronic decoder neural network accurately reconstructs the original information hidden within the deceptive output pattern. We numerically demonstrated our approach by designing an information hiding diffractive camera along with a jointly-optimized convolutional decoder neural network. The efficacy of this system was demonstrated under various lighting conditions and noise levels, showing its robustness. We further extended this information hiding camera to multi-spectral operation, allowing the concealment and decoding of multiple images at different wavelengths, all performed simultaneously in a single feed-forward operation. The feasibility of our framework was also demonstrated experimentally using THz radiation. This optical encoder-electronic decoder-based co-design provides a novel information hiding camera interface that is both high-speed and energy-efficient, offering an intriguing solution for visual information security.

Submitted 15 January, 2024; originally announced January 2024.

Comments: 26 Pages, 8 Figures

Journal ref: Science Advances (2024)

arXiv:2401.06498 [pdf, other]

doi 10.1145/3636555.3636906

Temporal and Between-Group Variability in College Dropout Prediction

Authors: Dominik Glandorf, Hye Rin Lee, Gabe Avakian Orona, Marina Pumptow, Renzhe Yu, Christian Fischer

Abstract: Large-scale administrative data is a common input in early warning systems for college dropout in higher education. Still, the terminology and methodology vary significantly across existing studies, and the implications of different modeling decisions are not fully understood. This study provides a systematic evaluation of contributing factors and predictive performance of machine learning models over time and across different student groups. Drawing on twelve years of administrative data at a large public university in the US, we find that dropout prediction at the end of the second year has a 20% higher AUC than at the time of enrollment in a Random Forest model. Also, most predictive factors at the time of enrollment, including demographics and high school performance, are quickly superseded in predictive importance by college performance and in later stages by enrollment behavior. Regarding variability across student groups, college GPA has more predictive value for students from traditionally disadvantaged backgrounds than their peers. These results can help researchers and administrators understand the comparative value of different data sources when building early warning systems and optimizing decisions under specific policy goals.

Submitted 12 January, 2024; originally announced January 2024.

Comments: Full paper accepted to Learning Analytics and Knowledge (LAK 2024)

Showing 1–50 of 701 results for author: Lee, R