Search | arXiv e-print repository

Transfer Learning with Pseudo Multi-Label Birdcall Classification for DS@GT BirdCLEF 2024

Authors: Anthony Miyaguchi, Adrian Cheung, Murilo Gustineli, Ashley Kim

Abstract: We present working notes for the DS@GT team on transfer learning with pseudo multi-label birdcall classification for the BirdCLEF 2024 competition, focused on identifying Indian bird species in recorded soundscapes. Our approach utilizes production-grade models such as the Google Bird Vocalization Classifier, BirdNET, and EnCodec to address representation and labeling challenges in the competition. We explore the distributional shift between this year's edition of unlabeled soundscapes representative of the hidden test set and propose a pseudo multi-label classification strategy to leverage the unlabeled data. Our highest post-competition public leaderboard score is 0.63 using BirdNET embeddings with Bird Vocalization pseudo-labels. Our code is available at https://github.com/dsgt-kaggle-clef/birdclef-2024

Submitted 8 July, 2024; originally announced July 2024.

Comments: Submitted and accepted into CLEF 2024 CEUR-WS proceedings
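
The pseudo multi-label idea in the abstract above can be illustrated with a minimal sketch: per-species probabilities from a pretrained teacher model are thresholded into a multi-hot target for each unlabeled clip. The function name, the species identifiers, and the 0.5 threshold are all assumptions made for illustration and are not taken from the authors' code.

```python
from typing import Dict, List


def pseudo_multilabel(
    probs: Dict[str, float], species: List[str], threshold: float = 0.5
) -> List[int]:
    """Turn per-species teacher probabilities into a multi-hot pseudo-label.

    `threshold` is a hypothetical cut-off, not a value from the paper.
    """
    return [1 if probs.get(name, 0.0) >= threshold else 0 for name in species]


# Placeholder species identifiers and teacher outputs for one audio clip.
species = ["species_a", "species_b", "species_c"]
teacher_probs = {"species_a": 0.91, "species_b": 0.12, "species_c": 0.64}

print(pseudo_multilabel(teacher_probs, species))  # [1, 0, 1]
```

More than one species can be active in the same clip, which is what separates this from assigning each clip a single pseudo-label.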

arXiv:2407.06174 [pdf, other]

The Tug-of-War Between Deepfake Generation and Detection

Authors: Hannah Lee, Changyeon Lee, Kevin Farhat, Lin Qiu, Steve Geluso, Aerin Kim, Oren Etzioni

Abstract: Multimodal generative models are rapidly evolving, leading to a surge in the generation of realistic video and audio that offers exciting possibilities but also serious risks. Deepfake videos, which can convincingly impersonate individuals, have particularly garnered attention due to their potential misuse in spreading misinformation and creating fraudulent content. This survey paper examines the dual landscape of deepfake video generation and detection, emphasizing the need for effective countermeasures against potential abuses. We provide a comprehensive overview of current deepfake generation techniques, including face swapping, reenactment, and audio-driven animation, which leverage cutting-edge technologies like generative adversarial networks and diffusion models to produce highly realistic fake videos. Additionally, we analyze various detection approaches designed to differentiate authentic from altered videos, from detecting visual artifacts to deploying advanced algorithms that pinpoint inconsistencies across video and audio signals. The effectiveness of these detection methods heavily relies on the diversity and quality of datasets used for training and evaluation. We discuss the evolution of deepfake datasets, highlighting the importance of robust, diverse, and frequently updated collections to enhance the detection accuracy and generalizability. As deepfakes become increasingly indistinguishable from authentic content, developing advanced detection techniques that can keep pace with generation technologies is crucial.
We advocate for a proactive approach in the "tug-of-war" between deepfake creators and detectors, emphasizing the need for continuous research collaboration, standardization of evaluation metrics, and the creation of comprehensive benchmarks.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.05820 [pdf, other]

Co-RaL: Complementary Radar-Leg Odometry with 4-DoF Optimization and Rolling Contact

Authors: Sangwoo Jung, Wooseong Yang, Ayoung Kim

Abstract: Robust and accurate localization in challenging environments is becoming crucial for SLAM. In this paper, we propose a unique sensor configuration for precise and robust odometry by integrating chip radar and a legged robot. Specifically, we introduce a tightly coupled radar-leg odometry algorithm for complementary drift correction. Adopting the 4-DoF optimization and decoupled RANSAC for mmWave chip radar significantly enhances radar odometry beyond the existing method, especially in the z direction, even when using a single radar. For the leg odometry, we employ rolling contact modeling-aided forward kinematics, accommodating scenarios with possible contact drift and radar failure. We evaluate our method by comparing it with other chip radar odometry algorithms using real-world datasets with diverse environments; the datasets will be released for the robotics community. https://github.com/SangwooJung98/Co-RaL-Dataset

Submitted 10 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

Comments: IROS 2024 accepted, 8 pages, 7 figures, 4 Tables

arXiv:2407.03684 [pdf, other]

ConPR: Ongoing Construction Site Dataset for Place Recognition

Authors: Dongjae Lee, Minwoo Jung, Ayoung Kim

Abstract: Place recognition, an essential challenge in computer vision and robotics, involves identifying previously visited locations. Despite algorithmic progress, challenges related to appearance change persist, with existing datasets often focusing on seasonal and weather variations but overlooking terrain changes. Understanding terrain alterations becomes critical for effective place recognition, given the aging infrastructure and ongoing city repairs. For real-world applicability, the comprehensive evaluation of algorithms must consider spatial dynamics. To address existing limitations, we present a novel multi-session place recognition dataset acquired from an active construction site. Our dataset captures ongoing construction progress through multiple data collections, facilitating evaluation in dynamic environments. It includes camera images, LiDAR point cloud data, and IMU data, enabling visual and LiDAR-based place recognition techniques, and supporting sensor fusion. Additionally, we provide ground truth information for range-based place recognition evaluation. Our dataset aims to advance place recognition algorithms in challenging and dynamic settings. Our dataset is available at https://github.com/dongjae0107/ConPR.

Submitted 4 July, 2024; originally announced July 2024.

Comments: 3 pages, 2 figures, IROS 2023 Workshop on Closing the Loop on Localization: What Are We Localizing For, and How Does That Shape Everything We Should Do?

arXiv:2407.00759 [pdf]

Analysis of Modern Computer Vision Models for Blood Cell Classification

Authors: Alexander Kim, Ryan Kim

Abstract: The accurate classification of white blood cells and related blood components is crucial for medical diagnoses. While traditional manual examinations and automated hematology analyzers have been widely used, they are often slow and prone to errors. Recent advancements in deep learning have shown promise for addressing these limitations. Earlier studies have demonstrated the viability of convolutional neural networks such as DenseNet, ResNet, and VGGNet for this task. Building on these foundations, our work employs more recent and efficient models to achieve rapid and accurate results. Specifically, this study used state-of-the-art architectures, including MaxVit, EfficientVit, EfficientNet, EfficientNetV2, and MobileNetV3. This study aimed to evaluate the performance of these models in WBC classification, potentially offering a more efficient and reliable alternative to current methods. Our approach not only addresses the speed and accuracy concerns of traditional techniques but also explores the applicability of innovative deep learning models in hematological analysis.

Submitted 30 June, 2024; originally announced July 2024.

Comments: 18 pages, 8 figures

ACM Class: I.4.9

arXiv:2406.16192 [pdf, other]

HEST-1k: A Dataset for Spatial Transcriptomics and Histology Image Analysis

Authors: Guillaume Jaume, Paul Doucet, Andrew H. Song, Ming Y. Lu, Cristina Almagro-Pérez, Sophia J. Wagner, Anurag J. Vaidya, Richard J. Chen, Drew F. K. Williamson, Ahrong Kim, Faisal Mahmood

Abstract: Spatial transcriptomics (ST) enables interrogating the molecular composition of tissue with ever-increasing resolution, depth, and sensitivity. However, costs, rapidly evolving technology, and lack of standards have constrained computational methods in ST to narrow tasks and small cohorts. In addition, the underlying tissue morphology as reflected by H&E-stained whole slide images (WSIs) encodes rich information often overlooked in ST studies. Here, we introduce HEST-1k, a collection of 1,108 spatial transcriptomic profiles, each linked to a WSI and metadata. HEST-1k was assembled using HEST-Library from 131 public and internal cohorts encompassing 25 organs, two species (Homo sapiens and Mus musculus), and 320 cancer samples from 25 cancer types. HEST-1k processing enabled the identification of 1.5 million expression-morphology pairs and 60 million nuclei. HEST-1k is tested on three use cases: (1) benchmarking foundation models for histopathology (HEST-Benchmark), (2) biomarker identification, and (3) multimodal representation learning. HEST-1k, HEST-Library, and HEST-Benchmark can be freely accessed via https://github.com/mahmoodlab/hest.

Submitted 23 June, 2024; originally announced June 2024.

Comments: Under review

arXiv:2406.15539 [pdf, other]

First Measurement of Deeply Virtual Compton Scattering on the Neutron with Detection of the Active Neutron

Authors: CLAS Collaboration, A. Hobart, S. Niccolai, M. Čuić, K. Kumerički, P. Achenbach, J. S. Alvarado, W. R. Armstrong, H. Atac, H. Avakian, L. Baashen, N. A. Baltzell, L. Barion, M. Bashkanov, M. Battaglieri, B. Benkel, F. Benmokhtar, A. Bianconi, A. S. Biselli, S. Boiarinov, M. Bondi, W. A. Booth, F. Bossù, K. -Th. Brinkmann, W. J. Briscoe, et al. (124 additional authors not shown)

Abstract: Measuring Deeply Virtual Compton Scattering on the neutron is one of the necessary steps to understand the structure of the nucleon in terms of Generalized Parton Distributions (GPDs). Neutron targets play a complementary role to transversely polarized proton targets in the determination of the GPD $E$. This poorly known and poorly constrained GPD is essential to obtain the contribution of the quarks' angular momentum to the spin of the nucleon. DVCS on the neutron was measured for the first time selecting the exclusive final state by detecting the neutron, using the Jefferson Lab longitudinally polarized electron beam, with energies up to 10.6 GeV, and the CLAS12 detector. The extracted beam-spin asymmetries, combined with DVCS observables measured on the proton, allow a clean quark-flavor separation of the imaginary parts of the GPDs $H$ and $E$.

Submitted 25 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

Comments: 7 pages, 6 figures

Report number: JLAB-PHY-24-4089

arXiv:2406.00856 [pdf, other]

DistilDIRE: A Small, Fast, Cheap and Lightweight Diffusion Synthesized Deepfake Detection

Authors: Yewon Lim, Changyeon Lee, Aerin Kim, Oren Etzioni

Abstract: A dramatic influx of diffusion-generated images has marked recent years, posing unique challenges to current detection technologies. While the task of identifying these images falls under binary classification, a seemingly straightforward category, the computational load is significant when employing the "reconstruction then compare" technique. This approach, known as DIRE (Diffusion Reconstruction Error), not only identifies diffusion-generated images but also detects those produced by GANs, highlighting the technique's broad applicability. To address the computational challenges and improve efficiency, we propose distilling the knowledge embedded in diffusion models to develop rapid deepfake detection models. Our approach, aimed at creating a small, fast, cheap, and lightweight diffusion synthesized deepfake detector, maintains robust performance while significantly reducing operational demands. Our experimental results indicate that it maintains performance while running inference 3.2 times faster than the existing DIRE framework. This advance not only enhances the practicality of deploying these systems in real-world settings but also paves the way for future research endeavors that seek to leverage diffusion model knowledge.

Submitted 2 June, 2024; originally announced June 2024.

Comments: 6 pages, 1 figure
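
The "reconstruction then compare" step named in the abstract above can be sketched as a per-pixel error between an image and its reconstruction. This is a toy illustration only: `mean_fill_stub` is a hypothetical stand-in for the diffusion-based reconstruction, which is the expensive part in practice, and the mean absolute difference is one simple choice of comparison, not necessarily the one used by DIRE or DistilDIRE.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel values in [0, 1]


def reconstruction_error(image: Image, reconstruct: Callable[[Image], Image]) -> float:
    """Mean absolute difference between an image and its reconstruction."""
    recon = reconstruct(image)
    diffs = [
        abs(a - b)
        for row, recon_row in zip(image, recon)
        for a, b in zip(row, recon_row)
    ]
    return sum(diffs) / len(diffs)


def mean_fill_stub(image: Image) -> Image:
    """Hypothetical reconstruction: replaces every pixel with the image mean."""
    count = sum(len(row) for row in image)
    mean = sum(sum(row) for row in image) / count
    return [[mean for _ in row] for row in image]


checkerboard = [[0.0, 1.0], [1.0, 0.0]]
flat = [[0.5, 0.5], [0.5, 0.5]]
print(reconstruction_error(checkerboard, mean_fill_stub))  # 0.5
print(reconstruction_error(flat, mean_fill_stub))          # 0.0
```

A detector built this way needs one full reconstruction per image at inference time, which is the cost the abstract says distillation is meant to reduce.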

arXiv:2405.18589 [pdf, other]

Candidate strongly-lensed Type Ia supernovae in the Zwicky Transient Facility archive

Authors: A. Townsend, J. Nordin, A. Sagués Carracedo, M. Kowalski, N. Arendse, S. Dhawan, A. Goobar, J. Johansson, E. Mörtsell, S. Schulze, I. Andreoni, E. Fernández, A. G. Kim, P. E. Nugent, F. Prada, M. Rigault, N. Sarin, D. Sharma, E. C. Bellm, M. W. Coughlin, R. Dekany, S. L. Groom, L. Lacroix, R. R. Laher, R. Riddle, et al. (39 additional authors not shown)

Abstract: Gravitationally lensed Type Ia supernovae (glSNe Ia) are unique astronomical tools for studying cosmological parameters, distributions of dark matter, the astrophysics of the supernovae and the intervening lensing galaxies themselves. Only a few highly magnified glSNe Ia have been discovered by ground-based telescopes, such as the Zwicky Transient Facility (ZTF), but simulations predict the existence of a fainter, undetected population. We present a systematic search in the ZTF archive of alerts from 1 June 2019 to 1 September 2022. Using the AMPEL platform, we developed a pipeline that distinguishes candidate glSNe Ia from other variable sources. Initial cuts were applied to the ZTF alert photometry before forced photometry was obtained for the remaining candidates. Additional cuts were applied to refine the candidates based on their light curve colours, lens galaxy colours, and the resulting parameters from fits to the SALT2 SN Ia template. Candidates were also cross-matched with the DESI spectroscopic catalogue. Seven transients passed all the cuts and had an associated galaxy DESI redshift, which we present as glSN Ia candidates. While superluminous supernovae (SLSNe) cannot be fully rejected, two events, ZTF19abpjicm and ZTF22aahmovu, are significantly different from typical SLSNe and their light curves can be modelled as two-image glSN Ia systems. From this two-image modelling, we estimate time delays of 22 $\pm$ 3 and 34 $\pm$ 1 days for the two events, respectively, which suggests that we have uncovered a population with longer time delays.
The pipeline is efficient and sensitive enough to parse full alert streams. It is currently being applied to the live ZTF alert stream to identify and follow up future candidates while active. This pipeline could be the foundation for glSNe Ia searches in future surveys, like the Vera C. Rubin Observatory's Legacy Survey of Space and Time.

Submitted 28 May, 2024; originally announced May 2024.

Comments: 21 pages, 15 figures

arXiv:2405.15395 [pdf, other]

Fieldscale: Locality-Aware Field-based Adaptive Rescaling for Thermal Infrared Image

Authors: Hyeonjae Gil, Myung-Hwan Jeon, Ayoung Kim

Abstract: Thermal infrared (TIR) cameras are emerging as promising sensors in safety-related fields due to their robustness against external illumination. However, a RAW TIR image has 14 bits of pixel depth and needs to be rescaled into 8 bits for general applications. Previous works utilize a global 1D look-up table to compute pixel-wise gain solely based on its intensity, which degrades image quality by failing to consider the local nature of the heat. We propose Fieldscale, a rescaling based on locality-aware 2D fields where both the intensity value and spatial context of each pixel within an image are embedded. It can adaptively determine the pixel gain for each region and produce spatially consistent 8-bit rescaled images with minimal information loss and high visibility. Consistent performance improvement on image quality assessment and two other downstream tasks supports the effectiveness and usability of Fieldscale. All code is publicly available to facilitate research advancements in this field. https://github.com/hyeonjaegil/fieldscale

Submitted 24 May, 2024; originally announced May 2024.

Comments: 9 pages, 8 figures, accepted to RA-L
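
For context, the global 1D look-up table baseline that the abstract above contrasts with Fieldscale can be sketched as a min-max mapping from 14-bit intensities to 8 bits. This is an illustrative assumption of what such a baseline might look like; it is not Fieldscale itself, and the function names and sample values are hypothetical.

```python
from typing import List


def build_global_lut(lo: int, hi: int, levels: int = 1 << 14) -> List[int]:
    """Min-max look-up table mapping 14-bit intensities onto 0..255.

    Every pixel with the same intensity gets the same output value,
    regardless of where it sits in the image.
    """
    span = max(hi - lo, 1)  # guard against a constant image
    lut = []
    for value in range(levels):
        clipped = min(max(value, lo), hi)
        lut.append(round(255 * (clipped - lo) / span))
    return lut


def rescale(image: List[List[int]], lut: List[int]) -> List[List[int]]:
    """Apply the look-up table to every pixel."""
    return [[lut[value] for value in row] for row in image]


raw = [[8000, 8100], [8300, 8400]]  # made-up 14-bit sensor counts
pixels = [value for row in raw for value in row]
lut = build_global_lut(min(pixels), max(pixels))
print(rescale(raw, lut))  # [[0, 64], [191, 255]]
```

Because the mapping depends only on intensity, a single hot object can stretch the range and compress contrast everywhere else, which is the limitation the locality-aware approach is said to address.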

arXiv:2405.15386 [pdf, other]

Exploring Baryon Resonances with Transition Generalized Parton Distributions: Status and Perspectives

Authors: Stefan Diehl, Kyungseon Joo, Kirill Semenov-Tian-Shansky, Christian Weiss, Vladimir Braun, Wen-Chen Chang, Pierre Chatagnon, Martha Constantinou, Yuxun Guo, Parada T. P. Hutauruk, Hyon-Suk Jo, Andrey Kim, Jun-Young Kim, Peter Kroll, Shunzo Kumano, Chang-Hwan Lee, Simonetta Liuti, Ronan McNulty, Hyeon-Dong Son, Pawel Sznajder, Ali Usman, Charlotte Van Hulse, Marc Vanderhaeghen, Michael Winn

Abstract: QCD gives rise to a rich spectrum of excited baryon states. Understanding their internal structure is important for many areas of nuclear physics, such as nuclear forces, dense matter, and neutrino-nucleus interactions. Generalized parton distributions (GPDs) are an established tool for characterizing the QCD structure of the ground-state nucleon. They are used to create 3D tomographic images of the quark/gluon structure and quantify the mechanical properties such as the distribution of mass, angular momentum and forces in the system. Transition GPDs extend these concepts to $N \rightarrow N^\ast$ transitions and can be used to characterize the 3D structure and mechanical properties of baryon resonances. They can be probed in high-momentum-transfer exclusive electroproduction processes with resonance transitions $e + N \rightarrow e' + M + N^\ast$, such as deeply-virtual Compton scattering ($M = γ$) or meson production ($M = π, K$, etc.), and in related photon/hadron-induced processes. This White Paper describes a research program aiming to explore baryon resonance structure with transition GPDs. This includes the properties and interpretation of the transition GPDs, theoretical methods for structures and processes, first experimental results from JLab 12 GeV, future measurements with existing and planned facilities (JLab detector and energy upgrades, COMPASS/AMBER, EIC, EicC, J-PARC, LHC ultraperipheral collisions), and the theoretical and experimental developments needed to realize this program.

Submitted 24 May, 2024; originally announced May 2024.

Report number: JLAB-THY-24-4071

arXiv:2405.09935 [pdf, other]

DEBATE: Devil's Advocate-Based Assessment and Text Evaluation

Authors: Alex Kim, Keonwoo Kim, Sangwon Yoon

Abstract: As natural language generation (NLG) models have become prevalent, systematically assessing the quality of machine-generated texts has become increasingly important. Recent studies introduce LLM-based evaluators that operate as reference-free metrics, demonstrating their capability to adeptly handle novel tasks. However, these models generally rely on a single-agent approach, which, we argue, introduces an inherent limit to their performance. This is because there exist biases in an LLM agent's responses, including preferences for certain text structure or content. In this work, we propose DEBATE, an NLG evaluation framework based on a multi-agent scoring system augmented with the concept of a Devil's Advocate. Within the framework, one agent is instructed to criticize other agents' arguments, potentially resolving the bias in LLM agents' answers. DEBATE substantially outperforms the previous state-of-the-art methods in two meta-evaluation benchmarks in NLG evaluation, SummEval and TopicalChat. We also show that the extensiveness of debates among agents and the persona of an agent can influence the performance of evaluators.

Submitted 23 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

arXiv:2405.04216 [pdf, other]

DESI 2024: Reconstructing Dark Energy using Crossing Statistics with DESI DR1 BAO data

Authors: R. Calderon, K. Lodha, A. Shafieloo, E. Linder, W. Sohn, A. de Mattia, J. L. Cervantes-Cota, R. Crittenden, T. M. Davis, M. Ishak, A. G. Kim, W. Matthewson, G. Niz, S. Park, J. Aguilar, S. Ahlen, S. Allen, D. Brooks, T. Claybaugh, A. de la Macorra, A. Dey, B. Dey, P. Doel, J. E. Forero-Romero, E. Gaztañaga, et al. (30 additional authors not shown)

Abstract: We implement Crossing Statistics to reconstruct in a model-agnostic manner the expansion history of the universe and properties of dark energy, using DESI Data Release 1 (DR1) BAO data in combination with one of three different supernova compilations (PantheonPlus, Union3, and DES-SN5YR) and Planck CMB observations. Our results hint towards an evolving and emergent dark energy behaviour, with negligible presence of dark energy at $z\gtrsim 1$, at varying significance depending on data sets combined. In all these reconstructions, the cosmological constant lies outside the 95% confidence intervals for some redshift ranges. This dark energy behaviour, reconstructed using Crossing Statistics, is in agreement with results from the conventional $w_0$--$w_a$ dark energy equation of state parametrization reported in the DESI Key cosmology paper. Our results add an extensive class of model-agnostic reconstructions with acceptable fits to the data, including models where cosmic acceleration slows down at low redshifts. We also report constraints on $H_0 r_d$ from our model-agnostic analysis, independent of the pre-recombination physics.

Submitted 7 May, 2024; originally announced May 2024.

Comments: 24 pages, 10 figures

arXiv:2405.03857 [pdf, other]

The MOST Hosts Survey: spectroscopic observation of the host galaxies of ~40,000 transients using DESI

Authors: Maayane T. Soumagnac, Peter Nugent, Robert A. Knop, Anna Y. Q. Ho, William Hohensee, Autumn Awbrey, Alexis Andersen, Greg Aldering, Matan Ventura, Jessica N. Aguilar, Steven Ahlen, Segev Y. Benzvi, David Brooks, Dillon Brout, Todd Claybaugh, Tamara M. Davis, Kyle Dawson, Axel de la Macorra, Arjun Dey, Biprateep Dey, Peter Doel, Kelly A. Douglass, Jaime E. Forero-Romero, Enrique Gaztanaga, Satya Gontcho A Gontcho, et al. (32 additional authors not shown)

Abstract: We present the MOST Hosts survey (Multi-Object Spectroscopy of Transient Hosts). The survey is planned to run throughout the five years of operation of the Dark Energy Spectroscopic Instrument (DESI) and will generate a spectroscopic catalog of the hosts of most transients observed to date, in particular all the supernovae observed by most public, untargeted, wide-field, optical surveys (PTF/iPTF, SDSS II, ZTF, DECAT, DESIRT). Scientific questions for which the MOST Hosts survey will be useful include Type Ia supernova cosmology, fundamental plane and peculiar velocity measurements, and the understanding of the correlations between transients and their host galaxy properties. Here, we present the first release of the MOST Hosts survey: 21,931 hosts of 20,235 transients. These numbers represent 36% of the final MOST Hosts sample, consisting of 60,212 potential host galaxies of 38,603 transients (a transient can be assigned multiple potential hosts). Of these galaxies, 40% do not appear in the DESI primary target list and therefore require a specific program like MOST Hosts. Of all the transients in the MOST Hosts list, only 26.7% have existing classifications, and so the survey will provide redshifts (and luminosities) for nearly 30,000 transients. A preliminary Hubble diagram and a transient luminosity-duration diagram are shown as examples of future potential uses of the MOST Hosts survey.
The survey will also provide a training sample of spectroscopically observed transients for photometry-only classifiers, as we enter an era when most newly observed transients will lack spectroscopic classification. The MOST Hosts DESI survey data will be released through the Wiserep platform on a rolling cadence and updated to match the DESI releases. Dates of future releases and updates are available through the https://mosthosts.desi.lbl.gov website.

Submitted 6 May, 2024; originally announced May 2024.

Comments: Submitted to ApJS
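
The counts quoted in the MOST Hosts abstract above can be cross-checked with a few lines of arithmetic. The numbers are taken directly from the abstract; reading the quoted 36% as the host-galaxy ratio is an interpretation, not something the abstract states explicitly.

```python
# Counts quoted in the abstract.
hosts_released, hosts_total = 21_931, 60_212
transients_released, transients_total = 20_235, 38_603
classified_fraction = 0.267  # transients with existing classifications

host_share = hosts_released / hosts_total
transient_share = transients_released / transients_total
unclassified = round(transients_total * (1 - classified_fraction))

print(f"{host_share:.1%}")       # 36.4%
print(f"{transient_share:.1%}")  # 52.4%
print(unclassified)              # 28296
```

Under this reading, the 36% figure matches the share of potential host galaxies released so far, and roughly 28,300 transients lack an existing classification, in line with the "nearly 30,000" in the abstract.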

arXiv:2405.00029 [pdf, ps, other]

Automatic Creative Selection with Cross-Modal Matching

Authors: Alex Kim, Jia Huang, Rob Monarch, Jerry Kwac, Anikesh Kamath, Parmeshwar Khurd, Kailash Thiyagarajan, Goodman Gu

Abstract: Application developers advertise their Apps by creating product pages with App images, and bidding on search terms. It is then crucial for App images to be highly relevant to the search terms. Solutions to this problem require an image-text matching model to predict the quality of the match between the chosen image and the search terms. In this work, we present a novel approach to matching an App image to search terms based on fine-tuning a pre-trained LXMERT model. We show that compared to the CLIP model and a baseline using a Transformer model for search terms, and a ResNet model for images, we significantly improve the matching accuracy. We evaluate our approach using two sets of labels: advertiser associated (image, search term) pairs for a given application, and human ratings for the relevance between (image, search term) pairs. Our approach achieves 0.96 AUC score for advertiser associated ground truth, outperforming the transformer+ResNet baseline and the fine-tuned CLIP model by 8% and 14%. For human labeled ground truth, our approach achieves 0.95 AUC score, outperforming the transformer+ResNet baseline and the fine-tuned CLIP model by 16% and 17%.

Submitted 28 February, 2024; originally announced May 2024.
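
The AUC metric reported in the abstract above can be read as the probability that a relevant (image, search term) pair is scored higher than an irrelevant one. The sketch below computes it by direct pairwise comparison; the function name and the toy scores are assumptions for illustration and do not come from the paper.

```python
from typing import List


def pairwise_auc(scores: List[float], labels: List[int]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy match scores for four (image, search term) pairs; 1 marks a relevant pair.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(pairwise_auc(scores, labels))  # 0.75
```

The quadratic loop is fine for a small example; a rank-based formulation would be the usual choice for large evaluation sets.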

arXiv:2404.17756 [pdf, other]

Suppressed self-diffusion of nanoscale constituents of a complex liquid

Authors: Christian P. N. Tanner, Vivian R. K. Wall, Mumtaz Gababa, Joshua Portner, Ahhyun Jeong, Matthew J. Hurley, Nicholas Leonard, Jonathan G. Raybin, James K. Utterback, Ahyoung Kim, Andrei Fluerasu, Yanwen Sun, Johannes Moeller, Alexey Zozulya, Wonhyuk Jo, Anders Madsen, Dmitri V. Talapin, Samuel W. Teitelbaum, Naomi S. Ginsberg

Abstract: The ability to understand and ultimately control the transformations and properties of various nanoscale systems, from proteins to synthetic nanomaterial assemblies, hinges on the ability to directly elucidate their dynamics on their characteristic length and time scales. Here, we use MHz X-ray photon correlation spectroscopy (XPCS) to directly elucidate the characteristic microsecond-dynamics of density fluctuations of semiconductor nanocrystals (NCs), not only in a colloidal dispersion but also in a liquid phase consisting of densely packed, yet mobile, NCs with no long-range order. By carefully disentangling X-ray induced effects, we find the wavevector-dependent fluctuation rates in the liquid phase are suppressed relative to those in the colloidal phase and to those in experiments and hydrodynamic theories of densely packed repulsive particles. We show that the suppressed rates are due to a substantial decrease in the self-diffusion of NCs in the liquid phase, which we attribute to explicit attractive interactions. Via comparison with simulations, we find that the extracted strength of the attractions explains the stability of the liquid phase, in contrast to the gelation observed via XPCS in many other charged colloidal systems. This work opens the door to elucidating fast, condensed phase dynamics in a variety of complex fluids and other nanoscale soft matter systems, such as densely packed proteins and non-equilibrium self-assembly processes.

Submitted 26 April, 2024; originally announced April 2024.

Comments: 15 pages, 4 figures

arXiv:2404.16397 [pdf, other]

Deep Learning-based Prediction of Breast Cancer Tumor and Immune Phenotypes from Histopathology

Authors: Tiago Gonçalves, Dagoberto Pulido-Arias, Julian Willett, Katharina V. Hoebel, Mason Cleveland, Syed Rakin Ahmed, Elizabeth Gerstner, Jayashree Kalpathy-Cramer, Jaime S. Cardoso, Christopher P. Bridge, Albert E. Kim

Abstract: The interactions between tumor cells and the tumor microenvironment (TME) dictate therapeutic efficacy of radiation and many systemic therapies in breast cancer. However, to date, there is not a widely available method to reproducibly measure tumor and immune phenotypes for each patient's tumor. Given this unmet clinical need, we applied multiple instance learning (MIL) algorithms to assess activity of ten biologically relevant pathways from the hematoxylin and eosin (H&E) slide of primary breast tumors. We employed different feature extraction approaches and state-of-the-art model architectures. Using binary classification, our models attained area under the receiver operating characteristic (AUROC) scores above 0.70 for nearly all gene expression pathways and, in some cases, exceeded 0.80. Attention maps suggest that our trained models recognize biologically relevant spatial patterns of cell sub-populations from H&E. These efforts represent a first step towards developing computational H&E biomarkers that reflect facets of the TME and hold promise for augmenting precision oncology.

Submitted 25 April, 2024; originally announced April 2024.

Comments: Paper accepted at the First Workshop on Imageomics (Imageomics-AAAI-24) - Discovering Biological Knowledge from Images using AI (https://sites.google.com/vt.edu/imageomics-aaai-24/home), held as part of the 38th Annual AAAI Conference on Artificial Intelligence (https://aaai.org/aaai-conference/)

MSC Class: 92C55 ACM Class: I.5.1; I.5.4; I.2.10; J.3

arXiv:2404.15389 [pdf, other]

Detecting unresolved lensed SNe Ia in LSST using blended light curves

Authors: Satadru Bag, Simon Huber, Sherry H. Suyu, Nikki Arendse, Irham Taufik Andika, Raoul Canameras, Alex Kim, Eric Linder, Kushal Lodha, Alejandra Melo, Anupreeta More, Stefan Schuldt, Arman Shafieloo

Abstract: Strong-gravitationally lensed supernovae (LSNe) are promising probes for providing absolute distance measurements using gravitational lens time delays. Spatially unresolved LSNe offer an opportunity to enhance the sample size for precision cosmology. We predict that there will be approximately $3$ times more unresolved than resolved LSNe Ia in the Legacy Survey of Space and Time (LSST) by the Rubin Observatory. In this article, we explore the feasibility of detecting unresolved LSNe Ia from the shape of the observed blended light curves using deep learning techniques, and we find that $\sim 30\%$ can be detected with a simple 1D CNN using well-sampled $rizy$-band light curves (with a false-positive rate of $\sim 3\%$). Even when the light curve is well-observed in only a single band among $r$, $i$, and $z$, detection is still possible with false-positive rates ranging from $\sim 4-7\%$, depending on the band. Furthermore, we demonstrate that these unresolved cases can be detected at an early stage using light curves up to $\sim20$ days from the first observation, with well-controlled false-positive rates, providing ample opportunities for triggering follow-up observations. Additionally, we demonstrate the feasibility of time-delay estimations using solely LSST-like data of unresolved light curves, particularly for doubles, when excluding systems with low time delay and magnification ratio. However, the abundance of such systems among those unresolved in LSST poses a significant challenge. This approach holds potential utility for upcoming wide-field surveys, and overall results could significantly improve with enhanced cadence and depth in the future surveys.

Submitted 23 April, 2024; originally announced April 2024.

Comments: 16 pages, 9 figures, submitted to A&A

arXiv:2404.13949 [pdf, other]

PeLiCal: Targetless Extrinsic Calibration via Penetrating Lines for RGB-D Cameras with Limited Co-visibility

Authors: Jaeho Shin, Seungsang Yun, Ayoung Kim

Abstract: RGB-D cameras are crucial in robotic perception, given their ability to produce images augmented with depth data. However, their limited FOV often requires multiple cameras to cover a broader area. In multi-camera RGB-D setups, the goal is typically to reduce camera overlap, optimizing spatial coverage with as few cameras as possible. The extrinsic calibration of these systems introduces additional complexities. Existing methods for extrinsic calibration either necessitate specific tools or highly depend on the accuracy of camera motion estimation. To address these issues, we present PeLiCal, a novel line-based calibration approach for RGB-D camera systems exhibiting limited overlap. Our method leverages long line features from surroundings, and filters out outliers with a novel convergence voting algorithm, achieving targetless, real-time, and outlier-robust performance compared to existing methods. We open source our implementation on https://github.com/joomeok/PeLiCal.git.

Submitted 23 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.13852 [pdf, other]

Toward Robust LiDAR based 3D Object Detection via Density-Aware Adaptive Thresholding

Authors: Eunho Lee, Minwoo Jung, Ayoung Kim

Abstract: Robust 3D object detection is a core challenge for autonomous mobile systems in field robotics. To tackle this issue, many researchers have demonstrated improvements in 3D object detection performance in datasets. However, real-world urban scenarios with unstructured and dynamic situations can still lead to numerous false positives, posing a challenge for robust 3D object detection models. This paper presents a post-processing algorithm that dynamically adjusts object detection thresholds based on the distance from the ego-vehicle. 3D object detection models usually perform well in detecting nearby objects but may exhibit suboptimal performance for distant ones. While conventional perception algorithms typically employ a single threshold in post-processing, the proposed algorithm addresses this issue by employing adaptive thresholds based on the distance from the ego-vehicle, minimizing false negatives and reducing false positives in urban scenarios. The results show performance enhancements in 3D object detection models across a range of scenarios, not only in dynamic urban road conditions but also in scenarios involving adverse weather conditions.

Submitted 21 April, 2024; originally announced April 2024.

Comments: 5 pages, 4 figures, Accepted to the IEEE ICRA Workshop on Field Robotics 2024

arXiv:2404.13022 [pdf, other]

doi 10.1016/j.jma.2024.06.008

Machine Learning-guided accelerated discovery of structure-property correlations in lean magnesium alloys for biomedical applications

Authors: Sreenivas Raguraman, Maitreyee Sharma Priyadarshini, Tram Nguyen, Ryan McGovern, Andrew Kim, Adam J. Griebel, Paulette Clancy, Timothy P. Weihs

Abstract: Magnesium alloys are emerging as promising alternatives to traditional orthopedic implant materials thanks to their biodegradability, biocompatibility, and impressive mechanical characteristics. However, their rapid in-vivo degradation presents challenges, notably in upholding mechanical integrity over time. This study investigates the impact of high-temperature thermal processing on the mechanical and degradation attributes of a lean Mg-Zn-Ca-Mn alloy, ZX10. Utilizing rapid, cost-efficient characterization methods like X-ray diffraction and optical, we swiftly examine microstructural changes post-thermal treatment. Employing Pearson correlation coefficient analysis, we unveil the relationship between microstructural properties and critical targets (properties): hardness and corrosion resistance. Additionally, leveraging the least absolute shrinkage and selection operator (LASSO), we pinpoint the dominant microstructural factors among closely correlated variables. Our findings underscore the significant role of grain size refinement in strengthening and the predominance of the ternary Ca2Mg6Zn3 phase in corrosion behavior. This suggests that achieving an optimal blend of strength and corrosion resistance is attainable through fine grains and reduced concentration of ternary phases. This thorough investigation furnishes valuable insights into the intricate interplay of processing, structure, and properties in magnesium alloys, thereby advancing the development of superior biodegradable implant materials.

Submitted 19 April, 2024; originally announced April 2024.

arXiv:2404.12770 [pdf, other]

Camera Agnostic Two-Head Network for Ego-Lane Inference

Authors: Chaehyeon Song, Sungho Yoon, Minhyeok Heo, Ayoung Kim, Sujung Kim

Abstract: Vision-based ego-lane inference using High-Definition (HD) maps is essential in autonomous driving and advanced driver assistance systems. The traditional approach necessitates well-calibrated cameras, which confines variation of camera configuration, as the algorithm relies on intrinsic and extrinsic calibration. In this paper, we propose a learning-based ego-lane inference by directly estimating the ego-lane index from a single image. To enhance robust performance, our model incorporates the two-head structure inferring ego-lane in two perspectives simultaneously. Furthermore, we utilize an attention mechanism guided by vanishing point-and-line to adapt to changes in viewpoint without requiring accurate calibration. The high adaptability of our model was validated in diverse environments, devices, and camera mounting points and orientations.

Submitted 19 April, 2024; originally announced April 2024.

arXiv:2404.09109 [pdf, other]

Optimizing Disjunctive Queries with Tagged Execution

Authors: Albert Kim, Samuel Madden

Abstract: Despite decades of research into query optimization, optimizing queries with disjunctive predicate expressions remains a challenge. Solutions employed by existing systems (if any) are often simplistic and lead to much redundant work being performed by the execution engine. To address these problems, we propose a novel form of query execution called tagged execution. Tagged execution groups tuples into subrelations based on which predicates in the query they satisfy (or don't satisfy) and tags them with that information. These tags then provide additional context for query operators to take advantage of during runtime, allowing them to eliminate much of the redundant work performed by traditional engines and realize predicate pushdown optimizations for disjunctive predicates. However, tagged execution brings its own challenges, and the question of what tags to create is a nontrivial one. Careless creation of tags can lead to an exponential blowup in the tag space, with the overhead outweighing the benefits. To address this issue, we present a technique called tag generalization to minimize the space of tags. We implemented the tagged execution model with tag generalization in our system Basilisk, and our evaluation shows an average 2.7x speedup in runtime over the traditional execution model with up to a 19x speedup in certain situations.

Submitted 22 April, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

arXiv:2404.01954 [pdf, other]

HyperCLOVA X Technical Report

Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.

Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: 44 pages; updated authors list and fixed author names

arXiv:2404.00830 [pdf, other]

2D Ego-Motion with Yaw Estimation using Only mmWave Radars via Two-Way weighted ICP

Authors: Hojune Kim, Hyesu Jang, Ayoung Kim

Abstract: The interest in single-chip mmWave radar is driven by its compact form factor, cost-effectiveness, and robustness under harsh environmental conditions. Despite its promising attributes, the principal limitation of mmWave radar lies in its capacity for autonomous yaw rate estimation. Conventional solutions have often resorted to integrating an inertial measurement unit (IMU) or deploying multiple radar units to circumvent this shortcoming. This paper introduces an innovative methodology for two-dimensional ego-motion estimation, focusing on yaw rate deduction, utilizing solely mmWave radar sensors. By applying a weighted Iterative Closest Point (ICP) algorithm to register processed points derived from heatmap data, our method facilitates 2D ego-motion estimation devoid of prior information. Through experimental validation, we verified the effectiveness and promise of our technique for ego-motion estimation using exclusively radar data.

Submitted 31 March, 2024; originally announced April 2024.

arXiv:2403.18358 [pdf]

doi 10.1007/s11370-023-00498-y

Imaging radar and LiDAR image translation for 3-DOF extrinsic calibration

Authors: Sangwoo Jung, Hyesu Jang, Minwoo Jung, Ayoung Kim, Myung-Hwan Jeon

Abstract: The integration of sensor data is crucial in the field of robotics to take full advantage of the various sensors employed. One critical aspect of this integration is determining the extrinsic calibration parameters, such as the relative transformation, between each sensor. The use of data fusion between complementary sensors, such as radar and LiDAR, can provide significant benefits, particularly in harsh environments where accurate depth data is required. However, noise included in radar sensor data can make the estimation of extrinsic calibration challenging. To address this issue, we present a novel framework for the extrinsic calibration of radar and LiDAR sensors, utilizing CycleGAN as a method of image-to-image translation. Our proposed method translates radar bird's-eye-view images into LiDAR-style images to estimate the 3-DOF extrinsic parameters. Image registration techniques, as well as deskewing based on sensor odometry and B-spline interpolation, are employed to address the rolling shutter effect commonly present in spinning sensors. Our method demonstrates a notable improvement in extrinsic calibration compared to filter-based methods using the MulRan dataset.

Submitted 27 March, 2024; originally announced March 2024.

arXiv:2403.17441 [pdf, other]

Adaptive LiDAR-Radar Fusion for Outdoor Odometry Across Dense Smoke Conditions

Authors: Chiyun Noh, Ayoung Kim

Abstract: Robust odometry estimation in perceptually degraded environments represents a key challenge in the field of robotics. In this paper, we propose a LiDAR-radar fusion method for robust odometry in adverse environments with LiDAR degeneracy. By comparing the LiDAR point cloud with the radar static point cloud obtained through a preprocessing module, it is possible to identify instances of LiDAR degeneracy to overcome perceptual limits. We demonstrate the effectiveness of our method in challenging conditions such as dense smoke, showcasing its ability to reliably estimate odometry and identify/remove dynamic points prone to LiDAR degeneracy.

Submitted 19 April, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

arXiv:2403.15725 [pdf, other]

Customizable wave tailoring materials enabled by nonlinear bilevel inverse design

Authors: Brianna MacNider, Haning Xiu, Kai Qian, Ian Frankel, Hyunsun Alicia Kim, Nicholas Boechler

Abstract: Passive transformation of waves via nonlinear systems is ubiquitous in settings ranging from acoustics to optics and electromagnetics. Passivity is of particular importance for responding rapidly to stimuli and nonlinearity enormously expands signal transformability compared to linear systems due to the breaking of superposition. It is well known that different types of nonlinearity yield vastly different effects on propagating signals, which raises the question of ``what precise nonlinearity is the best for a given wave tailoring application?'' Considering a one-dimensional spring-mass chain as a testbed, we couple the shape optimization of structures for tailored nonlinear constitutive responses with reduced-order nonlinear dynamical inverse design. Using minimization of peak kinetic energy transmission from impact as a case study, we identify ideal nonlinear constitutive responses and the geometries needed to achieve them. As part of this, we show the large sensitivity of this metric to small changes in nonlinearity, and thus the need for high precision, free-form nonlinearity tailoring. We validate our predictions using impact experiments in a chain of nonlinear springs and masses. This work sets the foundation for broader passive nonlinear mechanical wave tailoring material design, with applications to computing, signal processing, shock mitigation, and autonomous materials.

Submitted 30 June, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

arXiv:2403.04583 [pdf, other]

Unbiased Estimator for Distorted Conics in Camera Calibration

Authors: Chaehyeon Song, Jaeho Shin, Myung-Hwan Jeon, Jongwoo Lim, Ayoung Kim

Abstract: In the literature, points and conics have been major features for camera geometric calibration. Although conics are more informative features than points, the loss of the conic property under distortion has critically limited the utility of conic features in camera calibration. Many existing approaches addressed conic-based calibration by ignoring distortion or introducing 3D spherical targets to circumvent this limitation. In this paper, we present a novel formulation for conic-based calibration using moments. Our derivation is based on the mathematical finding that the first moment can be estimated without bias even under distortion. This allows us to track moment changes during projection and distortion, ensuring the preservation of the first moment of the distorted conic. With an unbiased estimator, the circular patterns can be accurately detected at the sub-pixel level and can now be fully exploited for an entire calibration pipeline, resulting in significantly improved calibration. The entire code is readily available from https://github.com/ChaehyeonSong/discocal.

Submitted 9 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

arXiv:2403.02773 [pdf, other]

doi 10.1109/LRA.2024.3349960

LodeStar: Maritime Radar Descriptor for Semi-Direct Radar Odometry

Authors: Hyesu Jang, Minwoo Jung, Myung-Hwan Jeon, Ayoung Kim

Abstract: Maritime radars are prevalently adopted to capture the vessel's omnidirectional data as imagery. Nevertheless, inherent challenges persist with marine radars, including limited frequency, suboptimal resolution, and indeterminate detections. Additionally, the scarcity of discernible landmarks in the vast marine expanses remains a challenge, resulting in consecutive scenes that often lack matching feature points. In this context, we introduce a resilient maritime radar scan representation LodeStar, and an enhanced feature extraction technique tailored for marine radar applications. Moreover, we embark on estimating marine radar odometry utilizing a semi-direct approach. LodeStar-based approach markedly attenuates the errors in odometry estimation, and our assertion is corroborated through meticulous experimental validation.

Submitted 5 March, 2024; originally announced March 2024.

Comments: IEEE Robotics and Automation Letters

Journal ref: IEEE Robotics and Automation Letters, 9-2 (2024) 1684-1691

arXiv:2402.12298 [pdf, other]

Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports

Authors: Felix J. Dorfner, Liv Jürgensen, Leonhard Donle, Fares Al Mohamad, Tobias R. Bodenmann, Mason C. Cleveland, Felix Busch, Lisa C. Adams, James Sato, Thomas Schultz, Albert E. Kim, Jameson Merkow, Keno K. Bressem, Christopher P. Bridge

Abstract: Introduction: With the rapid advances in large language models (LLMs), there have been numerous new open source as well as commercial models. While recent publications have explored GPT-4 in its application to extracting information of interest from radiology reports, there has not been a real-world comparison of GPT-4 to different leading open-source models. Materials and Methods: Two different and independent datasets were used. The first dataset consists of 540 chest x-ray reports that were created at the Massachusetts General Hospital between July 2019 and July 2021. The second dataset consists of 500 chest x-ray reports from the ImaGenome dataset. We then compared the commercial models GPT-3.5 Turbo and GPT-4 from OpenAI to the open-source models Mistral-7B, Mixtral-8x7B, Llama2-13B, Llama2-70B, QWEN1.5-72B and CheXbert and CheXpert-labeler in their ability to accurately label the presence of multiple findings in x-ray text reports using different prompting techniques. Results: On the ImaGenome dataset, the best performing open-source model was Llama2-70B with micro F1-scores of 0.972 and 0.970 for zero- and few-shot prompts, respectively. GPT-4 achieved micro F1-scores of 0.975 and 0.984, respectively. On the institutional dataset, the best performing open-source model was QWEN1.5-72B with micro F1-scores of 0.952 and 0.965 for zero- and few-shot prompting, respectively. GPT-4 achieved micro F1-scores of 0.975 and 0.973, respectively. Conclusion: In this paper, we show that while GPT-4 is superior to open-source models in zero-shot report labeling, the implementation of few-shot prompting can bring open-source models on par with GPT-4. This shows that open-source models could be a performant and privacy preserving alternative to GPT-4 for the task of radiology report classification.

Submitted 19 February, 2024; originally announced February 2024.

arXiv:2402.08810 [pdf, other]

Identifying HI Emission and UV Absorber Associations Near the Magellanic Stream

Authors: Doyeon A. Kim, Yong Zheng, Mary E. Putman

Abstract: We present a new technique to identify associations of HI emission in the Magellanic Stream (MS) and ultraviolet (UV) absorbers from 92 QSO sight lines near the MS. We quantify the level of associations of individual HI elements to the main HI body of the Stream using Wasserstein distance-based models, and derive characteristic spatial and kinematic distances of the HI emission in the MS. With the emission-based model, we further develop a comparison metric, which identifies the dominant associations of individual UV absorbers with respect to the MS and nearby galaxies. For ionized gas associated with the MS probed by CII, CIV, SiII, SiIII, SiIV, we find that the ion column densities are generally $\sim$0.5 dex higher than those that are not associated, and that the gas is more ionized toward the tail of the MS as indicated by the spatial trend of the CII/CIV ratios. For nearby galaxies, we identify potential new absorbers associated with the CGM of M33 and NGC300, and affirm the associations of absorbers with IC1613 and WLM. For M31, we find the previously identified gradient in column densities as a function of impact parameter, and that absorbers with higher column densities beyond M31's virial radius are more likely to be associated with the MS. Our analysis of absorbers associated with the Magellanic Clouds reveals the presence of continuous and blended diffuse ionized gas between the Stream and the Clouds. Our technique can be applied to future applications of identifying associations within physically complex gaseous structures.

Submitted 13 February, 2024; originally announced February 2024.

Comments: 54 pages, 17 figures, accepted to ApJ

arXiv:2401.13181 [pdf, other]

doi 10.5281/zenodo.10431797

doi 10.5281/zenodo.10256731

The EDGE-CALIFA Survey: An Extragalactic Database for Galaxy Evolution Studies

Authors: Tony Wong, Yixian Cao, Yufeng Luo, Alberto D. Bolatto, Sebastián F. Sánchez, Jorge K. Barrera-Ballesteros, Leo Blitz, Dario Colombo, Helmut Dannerbauer, Alex Green, Veselina Kalinova, Ferzem Khan, Andrew Kim, Eduardo A. D. Lacerda, Adam K. Leroy, Rebecca C. Levy, Xincheng Lin, Yuanze Luo, Erik W. Rosolowsky, Mónica Rubio, Peter Teuben, Dyas Utomo, Vicente Villanueva, Stuart N. Vogel, Xinyu Wang

Abstract: The EDGE-CALIFA survey provides spatially resolved optical integral field unit (IFU) and CO spectroscopy for 125 galaxies selected from the CALIFA Data Release 3 sample. The Extragalactic Database for Galaxy Evolution (EDGE) presents the spatially resolved products of the survey as pixel tables that reduce the oversampling in the original images and facilitate comparison of pixels from different images. By joining these pixel tables to lower dimensional tables that provide radial profiles, integrated spectra, or global properties, it is possible to investigate the dependence of local conditions on large-scale properties. The database is freely accessible and has been utilized in several publications. We illustrate the use of this database and highlight the effects of CO upper limits on the inferred slopes of the local scaling relations between stellar mass, star formation rate (SFR), and H$_2$ surface densities. We find that the correlation between H$_2$ and SFR surface density is the tightest among the three relations.

Submitted 23 January, 2024; originally announced January 2024.

Comments: 21 pages, accepted for publication in ApJS, see DOIs below for code and data access

arXiv:2401.04575 [pdf, other]

Let's Go Shopping (LGS) -- Web-Scale Image-Text Dataset for Visual Concept Understanding

Authors: Yatong Bai, Utsav Garg, Apaar Shanker, Haoming Zhang, Samyak Parajuli, Erhan Bas, Isidora Filipovic, Amelia N. Chu, Eugenia D Fomitcheva, Elliot Branson, Aerin Kim, Somayeh Sojoudi, Kyunghyun Cho

Abstract: Vision and vision-language applications of neural networks, such as image classification and captioning, rely on large-scale annotated datasets that require non-trivial data-collecting processes. This time-consuming endeavor hinders the emergence of large-scale datasets, limiting researchers and practitioners to a small number of choices. Therefore, we seek more efficient ways to collect and annotate images. Previous initiatives have gathered captions from HTML alt-texts and crawled social media postings, but these data sources suffer from noise, sparsity, or subjectivity. For this reason, we turn to commercial shopping websites whose data meet three criteria: cleanliness, informativeness, and fluency. We introduce the Let's Go Shopping (LGS) dataset, a large-scale public dataset with 15 million image-caption pairs from publicly available e-commerce websites. When compared with existing general-domain datasets, the LGS images focus on the foreground object and have less complex backgrounds. Our experiments on LGS show that the classifiers trained on existing benchmark datasets do not readily generalize to e-commerce data, while specific self-supervised visual feature extractors can better generalize. Furthermore, LGS's high-quality e-commerce-focused images and bimodal nature make it advantageous for vision-language bi-modal tasks: LGS enables image-captioning models to generate richer captions and helps text-to-image generation models achieve e-commerce style transfer.

Submitted 5 March, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

arXiv:2401.02929 [pdf, other]

The Dark Energy Survey: Cosmology Results With ~1500 New High-redshift Type Ia Supernovae Using The Full 5-year Dataset

Authors: DES Collaboration, T. M. C. Abbott, M. Acevedo, M. Aguena, A. Alarcon, S. Allam, O. Alves, A. Amon, F. Andrade-Oliveira, J. Annis, P. Armstrong, J. Asorey, S. Avila, D. Bacon, B. A. Bassett, K. Bechtol, P. H. Bernardinelli, G. M. Bernstein, E. Bertin, J. Blazek, S. Bocquet, D. Brooks, D. Brout, E. Buckley-Geer, D. L. Burke , et al. (134 additional authors not shown)

Abstract: We present cosmological constraints from the sample of Type Ia supernovae (SN Ia) discovered during the full five years of the Dark Energy Survey (DES) Supernova Program. In contrast to most previous cosmological samples, in which SN are classified based on their spectra, we classify the DES SNe using a machine learning algorithm applied to their light curves in four photometric bands. Spectroscopic redshifts are acquired from a dedicated follow-up survey of the host galaxies. After accounting for the likelihood of each SN being a SN Ia, we find 1635 DES SNe in the redshift range $0.10<z<1.13$ that pass quality selection criteria sufficient to constrain cosmological parameters. This quintuples the number of high-quality $z>0.5$ SNe compared to the previous leading compilation of Pantheon+, and results in the tightest cosmological constraints achieved by any SN data set to date. To derive cosmological constraints we combine the DES supernova data with a high-quality external low-redshift sample consisting of 194 SNe Ia spanning $0.025<z<0.10$. Using SN data alone and including systematic uncertainties we find $Ω_{\rm M}=0.352\pm 0.017$ in flat $Λ$CDM. Supernova data alone now require acceleration ($q_0<0$ in $Λ$CDM) with over $5σ$ confidence. We find $(Ω_{\rm M},w)=(0.264^{+0.074}_{-0.096},-0.80^{+0.14}_{-0.16})$ in flat $w$CDM. For flat $w_0w_a$CDM, we find $(Ω_{\rm M},w_0,w_a)=(0.495^{+0.033}_{-0.043},-0.36^{+0.36}_{-0.30},-8.8^{+3.7}_{-4.5})$. Including Planck CMB data, SDSS BAO data, and DES $3\times2$-point data gives $(Ω_{\rm M},w)=(0.321\pm0.007,-0.941\pm0.026)$. In all cases dark energy is consistent with a cosmological constant to within $\sim2σ$. In our analysis, systematic errors on cosmological parameters are subdominant compared to statistical errors, paving the way for future photometrically classified supernova analyses.

Submitted 6 June, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

Comments: 22 pages, 12 figures; Accepted by ApJL 29 March 2024; v3 updates to accepted version and includes links to data

Report number: FERMILAB-PUB-23-0821-PPD

arXiv:2312.17487 [pdf, other]

LiDAR Odometry Survey: Recent Advancements and Remaining Challenges

Authors: Dongjae Lee, Minwoo Jung, Wooseong Yang, Ayoung Kim

Abstract: Odometry is crucial for robot navigation, particularly in situations where global positioning methods like the Global Positioning System (GPS) are unavailable. The main goal of odometry is to predict the robot's motion and accurately determine its current location. Various sensors, such as wheel encoders, inertial measurement units (IMUs), cameras, radar, and Light Detection and Ranging (LiDAR), are used for odometry in robotics. LiDAR, in particular, has gained attention for its ability to provide rich three-dimensional (3D) data and immunity to light variations. This survey aims to examine advancements in LiDAR odometry thoroughly. We start by exploring LiDAR technology and then scrutinize LiDAR odometry works, categorizing them based on their sensor integration approaches. These approaches include methods relying solely on LiDAR, those combining LiDAR with IMU, strategies involving multiple LiDARs, and methods fusing LiDAR with other sensor modalities. In conclusion, we address existing challenges and outline potential future directions in LiDAR odometry. Additionally, we analyze public datasets and evaluation methods for LiDAR odometry. To our knowledge, this survey is the first comprehensive exploration of LiDAR odometry.

Submitted 29 December, 2023; originally announced December 2023.

Comments: 32 pages, 5 figures

arXiv:2312.09164 [pdf, other]

Properties of 3D HI Filaments in the Smith High Velocity Cloud

Authors: Colin Holm-Hansen, M. E. Putman, D. A. Kim

Abstract: We present findings of 3D filamentary structures in the Smith Cloud, a high-velocity cloud (HVC) located at $l=38^{\circ}$, $b=-13^{\circ}$. We use data from the Galactic Arecibo L-Band Feed Array HI (GALFA-HI) along with our new filament detection algorithm, fil3d, to characterize these structures. In this paper, we also discuss how different input parameters affect the output of fil3d. We study filaments in the local ISM and compare them to those found in the Smith Cloud. Based on thermal linewidth estimations we find supporting evidence that the Smith Cloud filaments are part of its warm neutral medium. We also find a relationship between thermal linewidth and the $v_{LSR}$ of the filaments. We study the plane-of-sky magnetic field as traced by Planck 353 GHz polarized dust emission along the line of sight and find the HI filaments in this region are not aligned with the magnetic field. This is likely related to their location close to dynamic processes in the Galactic Plane and/or the low column density of the filaments relative to emission in the Plane. The results show the HI filaments are found in a wide range of Galactic environments and form through multiple processes.

Submitted 14 December, 2023; originally announced December 2023.

arXiv:2312.03195 [pdf, other]

doi 10.18653/v1/2022.socialnlp-1.3

Detecting Rumor Veracity with Only Textual Information by Double-Channel Structure

Authors: Alex Kim, Sangwon Yoon

Abstract: Kyle (1985) proposes two types of rumors: informed rumors, which are based on some private information, and uninformed rumors, which are not based on any information (i.e. bluffing). Also, prior studies find that when people have a credible source of information, they are likely to use a more confident textual tone when spreading rumors. Motivated by these theoretical findings, we propose a double-channel structure to determine the ex-ante veracity of rumors on social media. Our ultimate goal is to classify each rumor into one of three categories: true, false, or unverifiable. We first assign each text to either the certain (informed rumor) or the uncertain (uninformed rumor) category. Then, we apply a lie detection algorithm to informed rumors and a thread-reply agreement detection algorithm to uninformed rumors. Using the dataset of SemEval 2019 Task 7, which requires ex-ante threefold classification (true, false, or unverifiable) of social media rumors, our model yields a macro-F1 score of 0.4027, outperforming all the baseline models and the second-place winner (Gorrell et al., 2019). Furthermore, we empirically validate that the double-channel structure outperforms single-channel structures that apply either the lie detection or the agreement detection algorithm to all posts.

Submitted 5 December, 2023; originally announced December 2023.

Journal ref: Proceedings of the Tenth International Workshop on Natural Language Processing for Social Media, 2022, 35--44

arXiv:2312.03194 [pdf, other]

doi 10.18653/v1/2021.econlp-1.4

Corporate Bankruptcy Prediction with Domain-Adapted BERT

Authors: Alex Kim, Sangwon Yoon

Abstract: This study applies BERT, a representative contextualized language model, to corporate disclosure data to predict impending bankruptcies. Prior literature on bankruptcy prediction mainly focuses on developing more sophisticated prediction methodologies with financial variables. However, in our study, we focus on improving the quality of the input dataset. Specifically, we employ a BERT model to perform sentiment analysis on MD&A disclosures. We show that BERT outperforms dictionary-based predictions and Word2Vec-based predictions in terms of adjusted R-square in logistic regression, k-nearest neighbor (kNN-5), and linear kernel support vector machine (SVM). Further, instead of pre-training the BERT model from scratch, we apply self-learning with confidence-based filtering to corporate disclosure data (10-K). We achieve an accuracy rate of 91.56% and demonstrate that the domain adaptation procedure brings a significant improvement in prediction accuracy.

Submitted 5 December, 2023; originally announced December 2023.

Journal ref: Proceedings of the Third Workshop on Economics and Natural Language Processing, 2021, 26--36

arXiv:2311.12098 [pdf, other]

Union Through UNITY: Cosmology with 2,000 SNe Using a Unified Bayesian Framework

Authors: David Rubin, Greg Aldering, Marc Betoule, Andy Fruchter, Xiaosheng Huang, Alex G. Kim, Chris Lidman, Eric Linder, Saul Perlmutter, Pilar Ruiz-Lapuente, Nao Suzuki

Abstract: Type Ia supernovae (SNe Ia) were instrumental in establishing the acceleration of the universe's expansion. By virtue of their combination of distance reach, precision, and prevalence, they continue to provide key cosmological constraints, complementing other cosmological probes. Individual SN surveys cover only about a factor of two in redshift, so compilations of multiple SN datasets are strongly beneficial. We assemble an updated "Union" compilation of 2087 cosmologically useful SNe Ia from 24 datasets ("Union3"). We take care to put all SNe on the same distance scale and update the light-curve fitting with SALT3 to use the full rest-frame optical. Over the next few years, the number of cosmologically useful SNe Ia will increase by more than a factor of ten, and keeping systematic uncertainties subdominant will be more challenging than ever. We discuss the importance of treating outliers, selection effects, light-curve shape and color populations and standardization relations, unexplained dispersion, and heterogeneous observations simultaneously. We present an updated Bayesian framework, called UNITY1.5 (Unified Nonlinear Inference for Type-Ia cosmologY), that incorporates significant improvements in our ability to model selection effects, standardization, and systematic uncertainties compared to earlier analyses. As an analysis byproduct, we also recover the posterior of the SN-only peculiar-velocity field, although we do not interpret it in this work. We compute updated cosmological constraints with Union3 and UNITY1.5, finding weak 1.7--2.6$σ$ tension with $Λ$CDM and possible evidence for thawing dark energy. We release our binned SN distances to the community.

Submitted 20 November, 2023; originally announced November 2023.

Comments: 58 pages, submitted to ApJ

arXiv:2311.04937 [pdf, other]

Multimodal Clinical Benchmark for Emergency Care (MC-BEC): A Comprehensive Benchmark for Evaluating Foundation Models in Emergency Medicine

Authors: Emma Chen, Aman Kansal, Julie Chen, Boyang Tom Jin, Julia Rachel Reisler, David A Kim, Pranav Rajpurkar

Abstract: We propose the Multimodal Clinical Benchmark for Emergency Care (MC-BEC), a comprehensive benchmark for evaluating foundation models in Emergency Medicine using a dataset of 100K+ continuously monitored Emergency Department visits from 2020-2022. MC-BEC focuses on clinically relevant prediction tasks at timescales from minutes to days, including predicting patient decompensation, disposition, and emergency department (ED) revisit, and includes a standardized evaluation framework with train-test splits and evaluation metrics. The multimodal dataset includes a wide range of detailed clinical data, including triage information, prior diagnoses and medications, continuously measured vital signs, electrocardiogram and photoplethysmograph waveforms, orders placed and medications administered throughout the visit, free-text reports of imaging studies, and information on ED diagnosis, disposition, and subsequent revisits. We provide performance baselines for each prediction task to enable the evaluation of multimodal, multitask models. We believe that MC-BEC will encourage researchers to develop more effective, generalizable, and accessible foundation models for multimodal clinical data.

Submitted 7 November, 2023; originally announced November 2023.

Comments: Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track

arXiv:2311.04020 [pdf, other]

Analyzing Film Adaptation through Narrative Alignment

Authors: Tanzir Pial, Shahreen Salim, Charuta Pethe, Allen Kim, Steven Skiena

Abstract: Novels are often adapted into feature films, but the differences between the two media usually require dropping sections of the source text from the movie script. Here we study this screen adaptation process by constructing narrative alignments using the Smith-Waterman local alignment algorithm coupled with SBERT embedding distance to quantify text similarity between scenes and book units. We use these alignments to perform an automated analysis of 40 adaptations, revealing insights into the screenwriting process concerning (i) faithfulness of adaptation, (ii) importance of dialog, (iii) preservation of narrative order, and (iv) gender representation issues reflective of the Bechdel test.

Submitted 7 November, 2023; originally announced November 2023.

Comments: 20 pages, 5 figures, 10 tables

arXiv:2311.03614 [pdf, other]

STONYBOOK: A System and Resource for Large-Scale Analysis of Novels

Authors: Charuta Pethe, Allen Kim, Rajesh Prabhakar, Tanzir Pial, Steven Skiena

Abstract: Books have historically been the primary mechanism through which narratives are transmitted. We have developed a collection of resources for the large-scale analysis of novels, including: (1) an open source end-to-end NLP analysis pipeline for the annotation of novels into a standard XML format, (2) a collection of 49,207 distinct cleaned and annotated novels, and (3) a database with an associated web interface for the large-scale aggregate analysis of these literary works. We describe the major functionalities provided in the annotation system along with their utilities. We present samples of analysis artifacts from our website, such as visualizations of character occurrences and interactions, similar books, representative vocabulary, part of speech statistics, and readability metrics. We also describe the use of the annotated format in qualitative and quantitative analysis across large corpora of novels.

Submitted 6 November, 2023; originally announced November 2023.

Comments: 8 pages, 12 figures

arXiv:2311.01384 [pdf, other]

Modeling of liquid metal droplet deformation by laser impact

Authors: I. Yu. Vichev, D. A. Kim, V. V. Medvedev

Abstract: A method for the sequential simulation of liquid metal droplet deformation by a laser pulse is considered. The first stage is the laser impact on a droplet. It was simulated using the RALEF-2D code, which is based on a radiative gas dynamic model. The next stage is target deformation from a droplet to a disk. This part of the simulation was carried out using the OpenFOAM code, where surface tension forces are taken into account. Good agreement with experimental results was obtained.

Submitted 2 November, 2023; originally announced November 2023.

arXiv:2310.19110 [pdf]

Comparative Analysis of Plastid Genomes Using Pangenome Research ToolKit (PGR-TK)

Authors: Richa Jayanti, Andrew Kim, Sean Pham, Athreya Raghavan, Anish Sharma, Manoj P. Samanta

Abstract: Plastid genomes (plastomes) of angiosperms are of great interest among biologists. High-throughput sequencing is making many such genomes accessible, increasing the need for tools to perform rapid comparative analysis. This exploratory analysis investigates whether the Pangenome Research Tool Kit (PGR-TK) is suitable for analyzing plastomes. After determining the optimal parameters for this tool on plastomes, we use it to compare sequences from each of the genera - Magnolia, Solanum, Fragaria and Cotoneaster, as well as a combined set from 20 rosid genera. PGR-TK recognizes large-scale plastome structures, such as the inverted repeats, among combined sequences from distant rosid families. If the plastid genomes are rotated to the same starting point, it also correctly groups different species from the same genus together in a generated cladogram. The visual approach of PGR-TK provides insights into genome evolution without requiring gene annotations.

Submitted 29 October, 2023; originally announced October 2023.

Comments: 15 pages, 4 figures

arXiv:2310.17721 [pdf, other]

From Transcripts to Insights: Uncovering Corporate Risks Using Generative AI

Authors: Alex Kim, Maximilian Muhn, Valeri Nikolaev

Abstract: We explore the value of generative AI tools, such as ChatGPT, in helping investors uncover dimensions of corporate risk. We develop and validate firm-level measures of risk exposure to political, climate, and AI-related risks. Using the GPT 3.5 model to generate risk summaries and assessments from the context provided by earnings call transcripts, we show that GPT-based measures possess significant information content and outperform the existing risk measures in predicting (abnormal) firm-level volatility and firms' choices such as investment and innovation. Importantly, information in risk assessments dominates that in risk summaries, establishing the value of general AI knowledge. We also find that generative AI is effective at detecting emerging risks, such as AI risk, which has soared in recent quarters. Our measures perform well both within and outside the GPT's training window and are priced in equity markets. Taken together, an AI-based approach to risk measurement provides useful insights to users of corporate disclosures at a low cost.

Submitted 26 October, 2023; originally announced October 2023.

arXiv:2310.05231 [pdf, other]

doi 10.1145/3613904.3642937

MindfulDiary: Harnessing Large Language Model to Support Psychiatric Patients' Journaling

Authors: Taewan Kim, Seolyeong Bae, Hyun Ah Kim, Su-woo Lee, Hwajung Hong, Chanmo Yang, Young-Ho Kim

Abstract: In the mental health domain, Large Language Models (LLMs) offer promising new opportunities, though their inherent complexity and low controllability have raised questions about their suitability in clinical settings. We present MindfulDiary, a mobile journaling app incorporating an LLM to help psychiatric patients document daily experiences through conversation. Designed in collaboration with mental health professionals (MHPs), MindfulDiary takes a state-based approach to safely comply with the experts' guidelines while carrying on free-form conversations. Through a four-week field study involving 28 patients with major depressive disorder and five psychiatrists, we found that MindfulDiary supported patients in consistently enriching their daily records and helped psychiatrists better empathize with their patients through an understanding of their thoughts and daily contexts. Drawing on these findings, we discuss the implications of leveraging LLMs in the mental health domain, bridging the technical feasibility and their integration into clinical settings.

Submitted 22 February, 2024; v1 submitted 8 October, 2023; originally announced October 2023.

Comments: 20 pages, 6 figures, 4 tables. Accepted at ACM CHI 2024

ACM Class: H.5.2; I.2.7

Journal ref: In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI '24), May 11-16, 2024, Honolulu, HI, USA. ACM, New York, NY, USA

arXiv:2309.14590 [pdf, other]

HeLiPR: Heterogeneous LiDAR Dataset for inter-LiDAR Place Recognition under Spatiotemporal Variations

Authors: Minwoo Jung, Wooseong Yang, Dongjae Lee, Hyeonjae Gil, Giseop Kim, Ayoung Kim

Abstract: Place recognition is crucial for robot localization and loop closure in simultaneous localization and mapping (SLAM). Light Detection and Ranging (LiDAR), known for its robust sensing capabilities and measurement consistency even in varying illumination conditions, has become pivotal in various fields, surpassing traditional imaging sensors in certain applications. Among various types of LiDAR, spinning LiDARs are widely used, while non-repetitive scanning patterns have recently been utilized in robotics applications. Some LiDARs provide additional measurements such as reflectivity, Near Infrared (NIR), and velocity from Frequency modulated continuous wave (FMCW) LiDARs. Despite these advances, there is a lack of comprehensive datasets reflecting the broad spectrum of LiDAR configurations for place recognition. To tackle this issue, our paper proposes the HeLiPR dataset, curated especially for place recognition with heterogeneous LiDARs, embodying spatiotemporal variations. To the best of our knowledge, the HeLiPR dataset is the first heterogeneous LiDAR dataset supporting inter-LiDAR place recognition with both non-repetitive and spinning LiDARs, accommodating different fields of view (FOVs) and varying numbers of rays. The dataset covers diverse environments, from urban cityscapes to high-dynamic freeways, over a month, enhancing adaptability and robustness across scenarios. Notably, HeLiPR includes trajectories parallel to MulRan sequences, making it valuable for research in heterogeneous LiDAR place recognition and long-term studies. The dataset is accessible at https://sites.google.com/view/heliprdataset.

Submitted 19 March, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

Comments: 11 pages, 9 figures, 5 tables

arXiv:2309.14041 [pdf, other]

Beam Charge Asymmetries for Deeply Virtual Compton Scattering on the Proton at CLAS12

Authors: E. Voutier, V. Burkert, S. Niccolai, R. Paremuzyan, A. Afanasev, J. -S. Alvarado-Galeano, M. Atoui, L. Barion, M. Battaglieri, J. Bernauer, A. Bianconi, M. Bondi, W. Briscoe, A. Camsonne, R. Capobianco, A. Celentano, P. Chatagnon, T. Chetry, G. Ciullo, P. Cole, M. Contalbrigo, G. Costantini, M. Defurne, A. Deur, R. De Vita , et al. (54 additional authors not shown)

Abstract: The parameterization of the nucleon structure through Generalized Parton Distributions (GPDs) shed a new light on the nucleon internal dynamics. For its direct interpretation, Deeply Virtual Compton Scattering (DVCS) is the golden channel for GPDs investigation. The DVCS process interferes with the Bethe-Heitler (BH) mechanism to constitute the leading order amplitude of the $eN \to eNγ$ process. The study of the $epγ$ reaction with polarized positron and electron beams gives a complete set of unique observables to unravel the different contributions to the $epγ$ cross section. This separates the different reaction amplitudes, providing a direct access to their real and imaginary parts which procures crucial constraints on the model dependences and associated systematic uncertainties on GPDs extraction. The real part of the BH-DVCS interference amplitude is particularly sensitive to the $D$-term which parameterizes the Gravitational Form Factors of the nucleon. The separation of the imaginary parts of the interference and DVCS amplitudes provides insights on possible higher-twist effects. We propose to measure the unpolarized and polarized Beam Charge Asymmetries (BCAs) of the $\vec{e}^{\pm}p \to e^{\pm}pγ$ process on an unpolarized hydrogen target with CLAS12, using polarized positron and electron beams at 10.6 GeV. The azimuthal and $t$-dependences of the unpolarized and polarized BCAs will be measured over a large $(x_B,Q^2)$ phase space using a 100 day run with a luminosity of $0.66 \times 10^{35}$ cm$^{-2}\cdot$s$^{-1}$.

Submitted 13 November, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

Comments: Proposal to the Jefferson Lab Program Advisory Committee (PAC51)

arXiv:2309.10777 [pdf, other]

doi: 10.1093/mnras/stad2792

The Kinematic Structure of Magnetically Aligned HI Filaments

Authors: Doyeon Avery Kim, Susan E Clark, Mary E Putman, Larry Li

Abstract: We characterize the kinematic and magnetic properties of HI filaments located in a high Galactic latitude region ($165^\circ < α < 195^\circ$ and $12^\circ < δ < 24^\circ$). We extract three-dimensional filamentary structures using \texttt{fil3d} from the Galactic Arecibo L-Band Feed Array HI (GALFA-HI) survey 21-cm emission data. Our algorithm identifies coherent emission structures in neighboring velocity channels. Based on the mean velocity, we identify a population of local and intermediate velocity cloud (IVC) filaments. We find the orientations of the local (but not the IVC) HI filaments are aligned with the magnetic field orientations inferred from Planck 353 GHz polarized dust emission. We analyze position-velocity diagrams of the velocity-coherent filaments, and find that only 15 percent of filaments demonstrate significant major-axis velocity gradients with a median magnitude of 0.5 km s$^{-1}$ pc$^{-1}$, assuming a fiducial filament distance of 100 pc. We conclude that the typical diffuse HI filament does not exhibit a simple velocity gradient. The reported filament properties constrain future theoretical models of filament formation.

Submitted 19 September, 2023; originally announced September 2023.

Comments: 15 pages, 16 figures

Journal ref: Monthly Notices of the Royal Astronomical Society, 2023

Showing 1–50 of 572 results for author: Kim, A