
Showing 1–50 of 401 results for author: Song, K

  1. arXiv:2408.00111  [pdf, other]

    physics.optics cond-mat.mes-hall

    Femtosecond switching of strong light-matter interactions in microcavities with two-dimensional semiconductors

    Authors: Armando Genco, Charalambos Louca, Cristina Cruciano, Kok Wee Song, Chiara Trovatello, Giuseppe Di Blasio, Giacomo Sansone, Sam Randerson, Peter Claronino, Rahul Jayaprakash, Kenji Watanabe, Takashi Taniguchi, David G. Lidzey, Oleksandr Kyriienko, Stefano Dal Conte, Alexander I. Tartakovskii, Giulio Cerullo

    Abstract: Ultrafast all-optical logic devices based on nonlinear light-matter interactions hold the promise to overcome the speed limitations of conventional electronic devices. Strong coupling of excitons and photons inside an optical resonator enhances such interactions and generates new polariton states which give access to unique nonlinear phenomena, such as Bose-Einstein condensation, used for all-opti…

    Submitted 31 July, 2024; originally announced August 2024.

  2. arXiv:2407.20502  [pdf, other]

    cs.CV

    Restoring Real-World Degraded Events Improves Deblurring Quality

    Authors: Yeqing Shen, Shang Li, Kun Song

    Abstract: Due to its high speed and low latency, DVS is frequently employed in motion deblurring. Ideally, high-quality events would adeptly capture intricate motion information. However, real-world events are generally degraded, thereby introducing significant artifacts into the deblurred results. In response to this challenge, we model the degradation of events and propose RDNet to improve the quality of…

    Submitted 29 July, 2024; originally announced July 2024.

  3. arXiv:2407.17491  [pdf, other]

    cs.CV cs.LG

    Robust Adaptation of Foundation Models with Black-Box Visual Prompting

    Authors: Changdae Oh, Gyeongdeok Seo, Geunyoung Jung, Zhi-Qi Cheng, Hosik Choi, Jiyoung Jung, Kyungwoo Song

    Abstract: With the surge of large-scale pre-trained models (PTMs), adapting these models to numerous downstream tasks becomes a crucial problem. Consequently, parameter-efficient transfer learning (PETL) of large models has grasped huge attention. While PETL methods show impressive performance, they commonly rely on two optimistic assumptions: 1) the entire parameters of a PTM are available, and 2) a suffic…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Extended work from the CVPR'23 paper: arxiv:2303.14773; This paper has been submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) for possible publication

  4. arXiv:2407.02867  [pdf, other]

    cs.MM cs.CL

    Contrast then Memorize: Semantic Neighbor Retrieval-Enhanced Inductive Multimodal Knowledge Graph Completion

    Authors: Yu Zhao, Ying Zhang, Baohang Zhou, Xinying Qian, Kehui Song, Xiangrui Cai

    Abstract: A large number of studies have emerged for Multimodal Knowledge Graph Completion (MKGC) to predict the missing links in MKGs. However, fewer studies have been proposed to study the inductive MKGC (IMKGC) involving emerging entities unseen during training. Existing inductive approaches focus on learning textual entity representations, which neglect rich semantic information in visual modality. More…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted by SIGIR 2024

  5. arXiv:2407.01853  [pdf, other]

    cs.CL cs.AI cs.LG

    Improving Multilingual Instruction Finetuning via Linguistically Natural and Diverse Datasets

    Authors: Sathish Reddy Indurthi, Wenxuan Zhou, Shamil Chollampatt, Ravi Agrawal, Kaiqiang Song, Lingxiao Zhao, Chenguang Zhu

    Abstract: Advancements in Large Language Models (LLMs) have significantly enhanced instruction-following capabilities. However, most Instruction Fine-Tuning (IFT) datasets are predominantly in English, limiting model performance in other languages. Traditional methods for creating multilingual IFT datasets such as translating existing English IFT datasets or converting existing NLP datasets into IFT dataset…

    Submitted 1 July, 2024; originally announced July 2024.

  6. arXiv:2407.01145  [pdf]

    physics.app-ph cond-mat.mtrl-sci

    Machine Learning-Assisted 3D Printing of Thermoelectric Materials of Ultrahigh Performances at Room Temperature

    Authors: Kaidong Song, Guoyue Xu, A. N. M. Tanvir, Ke Wang, Md Omarsany Bappy, Haijian Yang, Wenjie Shang, Le Zhou, Alexander Dowling, Tengei Luo, Yanliang Zhang

    Abstract: Thermoelectric energy conversion is an attractive technology for generating electricity from waste heat and using electricity for solid-state cooling. However, conventional manufacturing processes for thermoelectric devices are costly and limited to simple device geometries. This work reports an extrusion printing method to fabricate high-performance thermoelectric materials with complex 3D archit…

    Submitted 1 July, 2024; originally announced July 2024.

  7. arXiv:2407.00914  [pdf, ps, other]

    math.DS

    Multifractal analysis of the convergence exponents for the digits in $d$-decaying Gauss like dynamical systems

    Authors: Kunkun Song, Mengjie Zhang

    Abstract: Let $\{a_n(x)\}_{n\geq1}$ be the sequence of digits of $x\in(0,1)$ in infinite iterated function systems with polynomial decay of the derivative. We first study the multifractal spectrum of the convergence exponent defined by the sequence of the digits $\{a_n(x)\}_{n\geq1}$ and the weighted products of distinct digits with finite numbers respectively, and then calculate the Hausdorff dimensions of…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 17 pages

    MSC Class: 11K55; 28A80

  8. arXiv:2406.18157  [pdf]

    physics.app-ph

    Photosensitive PEEK Ink Enables Digital Light Processing 3D Printed High-performance Small Architected-Plastics

    Authors: Ze Zhang, Kewei Song, Rongyi Zhuang, Jianxian He, Yi Yang, Yifan Pan, Takeshi Mino, Kayo Hirose, Shinjiro Umezu

    Abstract: Polyetheretherketone (PEEK), as a semi-crystalline high-performance engineering plastic, has demonstrated good application prospects since its introduction. The ability of PEEK to be fabricated in complex architecture is a major limitation due to the inherent shortcomings of material extrusion 3D printing technology in terms of low resolution, low surface quality, and interlayer bonding. We propos…

    Submitted 26 June, 2024; originally announced June 2024.

  9. arXiv:2406.17862  [pdf, other]

    cs.LO

    ESBMC v7.6: Enhanced Model Checking of C++ Programs with Clang AST

    Authors: Xianzhiyu Li, Kunjian Song, Mikhail R. Gadelha, Franz Brauße, Rafael S. Menezes, Konstantin Korovin, Lucas C. Cordeiro

    Abstract: This paper presents Efficient SMT-Based Context-Bounded Model Checker (ESBMC) v7.6, an extended version based on previous work on ESBMC v7.3 by K. Song et al. The v7.3 introduced a new Clang-based C++ front-end to address the challenges posed by modern C++ programs. Although the new front-end has demonstrated significant potential in previous studies, it remains in the developmental stage and lack…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 27 pages, 2 figures. arXiv admin note: substantial text overlap with arXiv:2308.05649

  10. arXiv:2406.15664  [pdf, other]

    stat.ML cs.LG

    Flat Posterior Does Matter For Bayesian Transfer Learning

    Authors: Sungjun Lim, Jeyoon Yeom, Sooyon Kim, Hoyoon Byun, Jinho Kang, Yohan Jung, Jiyoung Jung, Kyungwoo Song

    Abstract: The large-scale pre-trained neural network has achieved notable success in enhancing performance for downstream tasks. Another promising approach for generalization is Bayesian Neural Network (BNN), which integrates Bayesian methods into neural network architectures, offering advantages such as Bayesian Model averaging (BMA) and uncertainty quantification. Despite these benefits, transfer learning…

    Submitted 21 June, 2024; originally announced June 2024.

  11. arXiv:2406.14228  [pdf, other]

    cs.AI

    EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms

    Authors: Siyu Yuan, Kaitao Song, Jiangjie Chen, Xu Tan, Dongsheng Li, Deqing Yang

    Abstract: The rise of powerful large language models (LLMs) has spurred a new trend in building LLM-based autonomous agents for solving complex tasks, especially multi-agent systems. Despite the remarkable progress, we notice that existing works are heavily dependent on human-designed frameworks, which greatly limits the functional scope and scalability of agent systems. How to automatically extend the spec…

    Submitted 11 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: Work in process

  12. arXiv:2406.12084  [pdf, other]

    cs.CL cs.AI

    When Reasoning Meets Information Aggregation: A Case Study with Sports Narratives

    Authors: Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Wenlin Yao, Hassan Foroosh, Dong Yu, Fei Liu

    Abstract: Reasoning is most powerful when an LLM accurately aggregates relevant information. We examine the critical role of information aggregation in reasoning by requiring the LLM to analyze sports narratives. To succeed at this task, an LLM must infer points from actions, identify related entities, attribute points accurately to players and teams, and compile key statistics to draw conclusions. We condu…

    Submitted 17 June, 2024; originally announced June 2024.

  13. arXiv:2406.11827  [pdf, other]

    cs.CL cs.AI cs.LG

    WPO: Enhancing RLHF with Weighted Preference Optimization

    Authors: Wenxuan Zhou, Ravi Agrawal, Shujian Zhang, Sathish Reddy Indurthi, Sanqiang Zhao, Kaiqiang Song, Silei Xu, Chenguang Zhu

    Abstract: Reinforcement learning from human feedback (RLHF) is a promising solution to align large language models (LLMs) more closely with human values. Off-policy preference optimization, where the preference data is obtained from other models, is widely adopted due to its cost efficiency and scalability. However, off-policy preference optimization often suffers from a distributional gap between the polic…

    Submitted 17 June, 2024; originally announced June 2024.

  14. arXiv:2406.11602  [pdf, other]

    astro-ph.SR

    Association between a Failed Prominence Eruption and the Drainage of Mass from Another Prominence

    Authors: Jianchao Xue, Li Feng, Hui Li, Ping Zhang, Jun Chen, Guanglu Shi, Kaifan Ji, Ye Qiu, Chuan Li, Lei Lu, Beili Ying, Ying Li, Yu Huang, Youping Li, Jingwei Li, Jie Zhao, Dechao Song, Shuting Li, Zhengyuan Tian, Yingna Su, Qingmin Zhang, Yunyi Ge, Jiahui Shan, Qiao Li, Gen Li, et al. (9 additional authors not shown)

    Abstract: Sympathetic eruptions of solar prominences have been studied for decades, however, it is usually difficult to identify their causal links. Here we present two failed prominence eruptions on 26 October 2022 and explore their connections. Using stereoscopic observations, the south prominence (PRO-S) erupts with untwisting motions, flare ribbons occur underneath, and new connections are formed during…

    Submitted 20 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: 15 pages, 7 figures, has been accepted by Solar Physics

  15. arXiv:2406.09716  [pdf, ps, other]

    cs.CR cs.AI cs.DC cs.LG

    Speed-up of Data Analysis with Kernel Trick in Encrypted Domain

    Authors: Joon Soo Yoo, Baek Kyung Song, Tae Min Ahn, Ji Won Heo, Ji Won Yoon

    Abstract: Homomorphic encryption (HE) is pivotal for secure computation on encrypted data, crucial in privacy-preserving data analysis. However, efficiently processing high-dimensional data in HE, especially for machine learning and statistical (ML/STAT) algorithms, poses a challenge. In this paper, we present an effective acceleration method using the kernel method for HE schemes, enhancing time performanc…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Submitted as a preprint

  16. arXiv:2406.09081  [pdf, ps, other]

    math.NT math.DS

    Multifractal analysis of the growth rate of digits in Schneider's $p$-adic continued fraction dynamical system

    Authors: Kunkun Song, Wanlou Wu, Yueli Yu, Sainan Zeng

    Abstract: Let $\mathbb{Z}_p$ be the ring of $p$-adic integers and $a_n(x)$ be the $n$-th digit of Schneider's $p$-adic continued fraction of $x\in p\mathbb{Z}_p$. We study the growth rate of the digits $\{a_n(x)\}_{n\geq1}$ from the viewpoint of multifractal analysis. The Hausdorff dimension of the set \[E_{\sup}(ψ)=\Big\{x\in p\mathbb{Z}_p:\ \limsup\limits_{n\to\infty}\frac{a_n(x)}{ψ(n)}=1\Big\}\] is compl…

    Submitted 13 June, 2024; originally announced June 2024.

  17. arXiv:2406.08263  [pdf, other]

    cond-mat.mes-hall

    Electrically tunable and enhanced nonlinearity of moiré exciton-polaritons in transition metal dichalcogenide bilayers

    Authors: Kok Wee Song, Oleksandr Kyriienko

    Abstract: We develop a microscopic theory for nonlinear optical response of moiré exciton-polaritons in bilayers of transition metal dichalcogenides (TMDs). Our theory allows to study the tunnel-coupled intralayer and interlayer excitonic modes for a wide range of twist angles ($θ$), external electric field, and light-matter coupling, providing insights into the hybridization regime inaccessible before. Spe…

    Submitted 21 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: main text: 8 pages and 3 figures, supplemental material: 5 pages and 1 figure

  18. arXiv:2406.07572  [pdf, ps, other]

    cs.AI cs.CE cs.LG

    Domain-specific ReAct for physics-integrated iterative modeling: A case study of LLM agents for gas path analysis of gas turbines

    Authors: Tao Song, Yuwei Fan, Chenlong Feng, Keyu Song, Chao Liu, Dongxiang Jiang

    Abstract: This study explores the application of large language models (LLMs) with callable tools in energy and power engineering domain, focusing on gas path analysis of gas turbines. We developed a dual-agent tool-calling process to integrate expert knowledge, predefined tools, and LLM reasoning. We evaluated various LLMs, including LLama3, Qwen1.5 and GPT. Smaller models struggled with tool usage and par…

    Submitted 1 June, 2024; originally announced June 2024.

  19. arXiv:2406.07471  [pdf, other]

    cs.CV

    OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding

    Authors: Ming Hu, Peng Xia, Lin Wang, Siyuan Yan, Feilong Tang, Zhongxing Xu, Yimin Luo, Kaimin Song, Jurgen Leitner, Xuelian Cheng, Jun Cheng, Chi Liu, Kaijing Zhou, Zongyuan Ge

    Abstract: Surgical scene perception via videos is critical for advancing robotic surgery, telesurgery, and AI-assisted surgery, particularly in ophthalmology. However, the scarcity of diverse and richly annotated video datasets has hindered the development of intelligent systems for surgical workflow analysis. Existing datasets face challenges such as small scale, lack of diversity in surgery and phase cate…

    Submitted 19 July, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted by ECCV 2024

  20. arXiv:2406.05763  [pdf, other]

    eess.AS

    WenetSpeech4TTS: A 12,800-hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark

    Authors: Linhan Ma, Dake Guo, Kun Song, Yuepeng Jiang, Shuai Wang, Liumeng Xue, Weiming Xu, Huan Zhao, Binbin Zhang, Lei Xie

    Abstract: With the development of large text-to-speech (TTS) models and scale-up of the training data, state-of-the-art TTS systems have achieved impressive performance. In this paper, we present WenetSpeech4TTS, a multi-domain Mandarin corpus derived from the open-sourced WenetSpeech dataset. Tailored for the text-to-speech tasks, we refined WenetSpeech by adjusting segment boundaries, enhancing the audio…

    Submitted 19 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH2024

  21. arXiv:2406.05613  [pdf, other]

    cs.RO

    Distributed Motion Control of Multiple Mobile Manipulator System with Disturbance and Communication Delay

    Authors: Wenhang Liu, Meng Ren, Kun Song, Michael Yu Wang, Zhenhua Xiong

    Abstract: In real-world object manipulation scenarios, multiple mobile manipulator systems may suffer from disturbances and asynchrony, leading to excessive interaction forces and causing object damage or emergency stops. This paper presents a novel distributed motion control approach aimed at reducing these unnecessary interaction forces. The control strategy only utilizes force information without the nee…

    Submitted 8 June, 2024; originally announced June 2024.

  22. arXiv:2406.05352  [pdf, other]

    cs.CV

    1st Place Winner of the 2024 Pixel-level Video Understanding in the Wild (CVPR'24 PVUW) Challenge in Video Panoptic Segmentation and Best Long Video Consistency of Video Semantic Segmentation

    Authors: Qingfeng Liu, Mostafa El-Khamy, Kee-Bong Song

    Abstract: The third Pixel-level Video Understanding in the Wild (PVUW CVPR 2024) challenge aims to advance the state of art in video understanding through benchmarking Video Panoptic Segmentation (VPS) and Video Semantic Segmentation (VSS) on challenging videos and scenes introduced in the large-scale Video Panoptic Segmentation in the Wild (VIPSeg) test set and the large-scale Video Scene Parsing in the Wi…

    Submitted 8 June, 2024; originally announced June 2024.

  23. arXiv:2406.04941  [pdf, ps, other]

    cs.CL

    TCMD: A Traditional Chinese Medicine QA Dataset for Evaluating Large Language Models

    Authors: Ping Yu, Kaitao Song, Fengchen He, Ming Chen, Jianfeng Lu

    Abstract: The recently unprecedented advancements in Large Language Models (LLMs) have propelled the medical community by establishing advanced medical-domain models. However, due to the limited collection of medical datasets, there are only a few comprehensive benchmarks available to gauge progress in this area. In this paper, we introduce a new medical question-answering (QA) dataset that contains massive…

    Submitted 7 June, 2024; originally announced June 2024.

  24. arXiv:2406.03999  [pdf, other]

    cs.LG cs.CV

    Unveiling the Dynamics of Information Interplay in Supervised Learning

    Authors: Kun Song, Zhiquan Tan, Bochao Zou, Huimin Ma, Weiran Huang

    Abstract: In this paper, we use matrix information theory as an analytical tool to analyze the dynamics of the information interplay between data representations and classification head vectors in the supervised learning process. Specifically, inspired by the theory of Neural Collapse, we introduce matrix mutual information ratio (MIR) and matrix entropy difference ratio (HDR) to assess the interactions of…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted by ICML 2024

  25. arXiv:2405.20840  [pdf, ps, other]

    math.PR

    Convergence rate of the Euler-Maruyama scheme to density dependent SDEs driven by $α$-stable additive noise

    Authors: Ke Song, Zimo Hao

    Abstract: In this paper, we establish the weak convergence rate of density-dependent stochastic differential equations with bounded drift driven by $α$-stable processes with $α\in(1,2)$. The well-posedness of these equations has been previously obtained in \cite{wu2023well}. We derive an explicit convergence rate in total variation for the Euler-Maruyama scheme, employing a technique rooted in \cite{hao2023…

    Submitted 31 May, 2024; originally announced May 2024.

  26. arXiv:2405.19119  [pdf, other]

    cs.LG

    Can Graph Learning Improve Task Planning?

    Authors: Xixi Wu, Yifei Shen, Caihua Shan, Kaitao Song, Siwei Wang, Bohang Zhang, Jiarui Feng, Hong Cheng, Wei Chen, Yun Xiong, Dongsheng Li

    Abstract: Task planning is emerging as an important research topic alongside the development of large language models (LLMs). It aims to break down complex user requests into solvable sub-tasks, thereby fulfilling the original requests. In this context, the sub-tasks can be naturally viewed as a graph, where the nodes represent the sub-tasks, and the edges denote the dependencies among them. Consequently, t…

    Submitted 29 May, 2024; originally announced May 2024.

  27. arXiv:2405.11747  [pdf, ps, other]

    math.AP

    Wolff potentials and nonlocal equations of Lane-Emden type

    Authors: Quoc-Hung Nguyen, Jihoon Ok, Kyeong Song

    Abstract: We consider nonlocal equations of the type \[ (-Δ_{p})^{s}u = μ\quad \text{in }Ω, \] where $Ω\subset \mathbb{R}^{n}$ is either a bounded domain or the whole $\mathbb{R}^{n}$, $μ$ is a Radon measure on $Ω$, $0<s<1$ and $1<p<n/s$. Especially, we extend the existence, regularity and Wolff potential estimates for SOLA (Solutions Obtained as Limits of Approximations), established by Kuusi, Mingione, an…

    Submitted 19 May, 2024; originally announced May 2024.

  28. arXiv:2405.11726  [pdf, other]

    cs.RO

    RHAML: Rendezvous-based Hierarchical Architecture for Mutual Localization

    Authors: Gaoming Chen, Kun Song, Xiang Xu, Wenhang Liu, Zhenhua Xiong

    Abstract: Mutual localization serves as the foundation for collaborative perception and task assignment in multi-robot systems. Effectively utilizing limited onboard sensors for mutual localization between marker-less robots is a worthwhile goal. However, due to inadequate consideration of large scale variations of the observed robot and localization refinement, previous work has shown limited accuracy when…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: 8 pages, 8 figures, submitted to RA-L

  29. arXiv:2405.08345  [pdf, other]

    cs.RO

    Multi-Robot Rendezvous in Unknown Environment with Limited Communication

    Authors: Kun Song, Gaoming Chen, Wenhang Liu, Zhenhua Xiong

    Abstract: Rendezvous aims at gathering all robots at a specific location, which is an important collaborative behavior for multirobot systems. However, in an unknown environment, it is challenging to achieve rendezvous. Previous researches mainly focus on special scenarios where communication is not allowed and each robot executes a random searching strategy, which is highly time-consuming, especially in la…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: Submit to RAL. 8 pages, 6 figures

  30. arXiv:2404.19205  [pdf, other]

    cs.CV cs.AI

    TableVQA-Bench: A Visual Question Answering Benchmark on Multiple Table Domains

    Authors: Yoonsik Kim, Moonbin Yim, Ka Yeon Song

    Abstract: In this paper, we establish a benchmark for table visual question answering, referred to as the TableVQA-Bench, derived from pre-existing table question-answering (QA) and table structure recognition datasets. It is important to note that existing datasets have not incorporated images or QA pairs, which are two crucial components of TableVQA. As such, the primary objective of this paper is to obta…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Technical Report

  31. arXiv:2404.18252  [pdf, other]

    cs.CV

    Fisher Information Improved Training-Free Conditional Diffusion Model

    Authors: Kaiyu Song, Hanjiang Lai

    Abstract: Recently, the diffusion model with the training-free methods has succeeded in conditional image generation tasks. However, there is an efficiency problem because it requires calculating the gradient with high computational cost, and previous methods make strong assumptions to solve it, sacrificing generalization. In this work, we propose the Fisher information guided diffusion model (FIGD). Concre…

    Submitted 28 April, 2024; originally announced April 2024.

  32. arXiv:2404.13694  [pdf, other]

    cond-mat.mtrl-sci

    Solute segregation in polycrystalline aluminum from hybrid Monte Carlo and molecular dynamics simulations with a unified neuroevolution potential

    Authors: Keke Song, Jiahui Liu, Shunda Chen, Zheyong Fan, Yanjing Su, Ping Qian

    Abstract: One of the most effective methods to enhance the strength of aluminum alloys involves modifying grain boundaries (GBs) through solute segregation. However, the fundamental mechanisms of solute segregation and their impacts on material properties remain elusive. In this study, we implemented highly efficient hybrid Monte Carlo and molecular dynamics (MCMD) algorithms in the graphics process units m…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 10 pages, 6 figures

  33. arXiv:2404.11092  [pdf, ps, other]

    econ.EM stat.ME

    Estimation for conditional moment models based on martingale difference divergence

    Authors: Kunyang Song, Feiyu Jiang, Ke Zhu

    Abstract: We provide a new estimation method for conditional moment models via the martingale difference divergence (MDD). Our MDD-based estimation method is formed in the framework of a continuum of unconditional moment restrictions. Unlike the existing estimation methods in this framework, the MDD-based estimation method adopts a non-integrable weighting function, which could grab more information from unc…

    Submitted 17 April, 2024; originally announced April 2024.

  34. arXiv:2404.09531  [pdf, other]

    cs.CV cs.GR

    Oblique-MERF: Revisiting and Improving MERF for Oblique Photography

    Authors: Xiaoyi Zeng, Kaiwen Song, Leyuan Yang, Bailin Deng, Juyong Zhang

    Abstract: Neural implicit fields have established a new paradigm for scene representation, with subsequent work achieving high-quality real-time rendering. However, reconstructing 3D scenes from oblique aerial photography presents unique challenges, such as varying spatial scale distributions and a constrained range of tilt angles, often resulting in high memory consumption and reduced rendering quality at…

    Submitted 15 April, 2024; originally announced April 2024.

  35. arXiv:2404.05674  [pdf, other]

    cs.CV

    MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation

    Authors: Kunpeng Song, Yizhe Zhu, Bingchen Liu, Qing Yan, Ahmed Elgammal, Xiao Yang

    Abstract: In this paper, we present MoMA: an open-vocabulary, training-free personalized image model that boasts flexible zero-shot capabilities. As foundational text-to-image models rapidly evolve, the demand for robust image-to-image translation grows. Addressing this need, MoMA specializes in subject-driven personalized image generation. Utilizing an open-source, Multimodal Large Language Model (MLLM), w…

    Submitted 8 April, 2024; originally announced April 2024.

  36. arXiv:2404.02117  [pdf, other]

    cs.CV

    Pre-trained Vision and Language Transformers Are Few-Shot Incremental Learners

    Authors: Keon-Hee Park, Kyungwoo Song, Gyeong-Moon Park

    Abstract: Few-Shot Class Incremental Learning (FSCIL) is a task that requires a model to learn new classes incrementally without forgetting when only a few samples for each class are given. FSCIL encounters two significant challenges: catastrophic forgetting and overfitting, and these challenges have driven prior studies to primarily rely on shallow models, such as ResNet-18. Even though their limited capac…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024

  37. arXiv:2404.01954  [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han , et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  38. arXiv:2404.01706  [pdf, other]

    cs.CL

    Polarity Calibration for Opinion Summarization

    Authors: Yuanyuan Lei, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Ruihong Huang, Dong Yu

    Abstract: Opinion summarization is automatically generating summaries from a variety of subjective information, such as product reviews or political opinions. The challenge of opinions summarization lies in presenting divergent or even conflicting opinions. We conduct an analysis of previous summarization models, which reveals their inclination to amplify the polarity bias, emphasizing the majority opinions…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted to NAACL 2024

  39. arXiv:2403.19833  [pdf, other]

    cs.NI cs.AI

    ChatTracer: Large Language Model Powered Real-time Bluetooth Device Tracking System

    Authors: Qijun Wang, Shichen Zhang, Kunzhe Song, Huacheng Zeng

    Abstract: Large language models (LLMs) have transformed the way we interact with cyber technologies. In this paper, we study the possibility of connecting LLM with wireless sensor networks (WSN). A successful design will not only extend LLM's knowledge landscape to the physical world but also revolutionize human interaction with WSN. To the end, we present ChatTracer, an LLM-powered real-time Bluetooth devi…

    Submitted 9 July, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

  40. arXiv:2403.15187  [pdf, ps, other]

    hep-ph nucl-th

    Spectrum of $S$- and $P$-wave $cc\bar{q}\bar{q}'$ $(\bar{q},\bar{q}' = \bar{u}, \bar{d}, \bar{s})$ systems in a chiral SU(3) quark model

    Authors: Du Wang, Ke-Rang Song, Wen-Ling Wang, Fei Huang

    Abstract: Inspired by the resonance $T_{cc}^+(3875)$ recently observed by the LHCb Collaboration, we systematically explore the $S$- and $P$-wave $cc\bar{q}\bar{q}'$ $(\bar{q},\bar{q}' = \bar{u}, \bar{d}, \bar{s})$ systems in a chiral SU(3) quark model. The Hamiltonian contains the kinetic energy, the one-gluon-exchange (OGE) potential, the confinement potential, and the one-boson-exchange (OBE) potential s…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 9 pages, 2 figures

  41. arXiv:2403.10558  [pdf, other]

    cs.CV cs.CR cs.LG

    Adaptive Hybrid Masking Strategy for Privacy-Preserving Face Recognition Against Model Inversion Attack

    Authors: Yinggui Wang, Yuanqing Huang, Jianshu Li, Le Yang, Kai Song, Lei Wang

    Abstract: The utilization of personal sensitive data in training face recognition (FR) models poses significant privacy concerns, as adversaries can employ model inversion attacks (MIA) to infer the original training data. Existing defense methods, such as data augmentation and differential privacy, have been employed to mitigate this issue. However, these methods often fail to strike an optimal balance bet…

    Submitted 23 April, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  42. arXiv:2403.09073  [pdf, other]

    cs.CL

    Large Language Models are Parallel Multilingual Learners

    Authors: Yongyu Mu, Peinan Feng, Zhiquan Cao, Yuzhang Wu, Bei Li, Chenglong Wang, Tong Xiao, Kai Song, Tongran Liu, Chunliang Zhang, Jingbo Zhu

    Abstract: In this study, we reveal an in-context learning (ICL) capability of multilingual large language models (LLMs): by translating the input to several languages, we provide Parallel Input in Multiple Languages (PiM) to LLMs, which significantly enhances their comprehension abilities. To test this capability, we design extensive experiments encompassing 8 typical datasets, 7 languages and 8 state-of-th…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: Work in progress

  43. arXiv:2403.08964  [pdf]

    physics.bio-ph physics.med-ph

    Hyperelasticity of Blood Clots: Bridging the Gap between Microscopic and Continuum Scales

    Authors: Nicholas Filla, Beikang Gu, Jixin Hou, Kenan Song, He Li, Ning Liu, Xianqiao Wang

    Abstract: The biomechanical properties of blood clots, which are dictated by their compositions and micro-structures, play a critical role in determining their fates, occlusion, persistency, or embolization in the human circulatory system. While numerous constitutive models have emerged to describe the biomechanics of blood clots, the majority of these models have primarily focused on the macroscopic deform…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 13 figures

  44. arXiv:2403.08827  [pdf, other]

    math.OC

    Locational Scenario-based Pricing in a Bilateral Distribution Energy Market under Uncertainty

    Authors: Hien Thanh Doan, Minsoo Kim, Keunju Song, Hongseok Kim

    Abstract: In recent years, there has been a significant focus on advancing the next generation of power systems. Despite these efforts, persistent challenges revolve around addressing the operational impact of uncertainty on predicted data, especially concerning economic dispatch and optimal power flow. To tackle these challenges, we introduce a stochastic day-ahead scheduling approach for a community. This…

    Submitted 11 March, 2024; originally announced March 2024.

  45. arXiv:2403.05952  [pdf]

    physics.app-ph cond-mat.mtrl-sci

    New Directions for Thermoelectrics: A Roadmap from High-Throughput Materials Discovery to Advanced Device Manufacturing

    Authors: Kaidong Song, A. N. M. Tanvir, Md Omarsany Bappy, Yanliang Zhang

    Abstract: Thermoelectric materials, which can convert waste heat into electricity or act as solid-state Peltier coolers, are emerging as key technologies to address global energy shortages and environmental sustainability. However, discovering materials with high thermoelectric conversion efficiency is a complex and slow process. The emerging field of high-throughput material discovery demonstrates its pote…

    Submitted 9 March, 2024; originally announced March 2024.

  46. arXiv:2403.04031  [pdf, other]

    cs.CL cs.AI

    Can Large Language Models do Analytical Reasoning?

    Authors: Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Hassan Foroosh, Dong Yu, Fei Liu

    Abstract: This paper explores the cutting-edge Large Language Model with analytical reasoning on sports. Our analytical reasoning embodies the tasks of letting large language models count how many points each team scores in a quarter in the NBA and NFL games. Our major discoveries are twofold. Firstly, we find among all the models we employed, GPT-4 stands out in effectiveness, followed by Claude-2.1,…

    Submitted 6 March, 2024; originally announced March 2024.

  47. arXiv:2403.03100  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

    Authors: Zeqian Ju, Yuancheng Wang, Kai Shen, Xu Tan, Detai Xin, Dongchao Yang, Yanqing Liu, Yichong Leng, Kaitao Song, Siliang Tang, Zhizheng Wu, Tao Qin, Xiang-Yang Li, Wei Ye, Shikun Zhang, Jiang Bian, Lei He, Jinyu Li, Sheng Zhao

    Abstract: While recent large-scale text-to-speech (TTS) models have achieved significant progress, they still fall short in speech quality, similarity, and prosody. Considering speech intricately encompasses various attributes (e.g., content, prosody, timbre, and acoustic details) that pose significant challenges for generation, a natural idea is to factorize speech into individual subspaces representing di…

    Submitted 23 April, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: Achieving human-level quality and naturalness on multi-speaker datasets (e.g., LibriSpeech) in a zero-shot way

  48. arXiv:2402.14279  [pdf, other]

    cs.CL cs.AI

    Mitigating the Linguistic Gap with Phonemic Representations for Robust Multilingual Language Understanding

    Authors: Haeji Jung, Changdae Oh, Jooeon Kang, Jimin Sohn, Kyungwoo Song, Jinkyu Kim, David R. Mortensen

    Abstract: Approaches to improving multilingual language understanding often require multiple languages during the training phase, rely on complicated training techniques, and -- importantly -- struggle with significant performance gaps between high-resource and low-resource languages. We hypothesize that the performance gaps between languages are affected by linguistic gaps between those languages and provi…

    Submitted 21 February, 2024; originally announced February 2024.

  49. arXiv:2402.10979  [pdf, other]

    cs.CL cs.AI

    SportsMetrics: Blending Text and Numerical Data to Understand Information Fusion in LLMs

    Authors: Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Hassan Foroosh, Dong Yu, Fei Liu

    Abstract: Large language models hold significant potential for integrating various data types, such as text documents and database records, for advanced analytics. However, blending text and numerical data presents substantial challenges. LLMs need to process and cross-reference entities and numbers, handle data inconsistencies and redundancies, and develop planning capabilities such as building a working m…

    Submitted 16 June, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: ACL 2024 Long Paper

  50. arXiv:2402.10047  [pdf, other]

    physics.chem-ph

    OH-Formation Following Vibrationally Induced Reaction Dynamics of H$_2$COO

    Authors: Kaisheng Song, Meenu Upadhyay, Markus Meuwly

    Abstract: The reaction dynamics of H$_2$COO to form linear HCOOH and dioxirane as first steps for OH-elimination is quantitatively investigated. Using a machine learned potential energy surface at the CASPT2/aug-cc-pVTZ level of theory vibrational excitation along the CH-normal mode $ν_{\rm CH}$ with energies up to 40.0 kcal/mol ($\sim 5 ν_{\rm CH}$) leads almost exclusively to linear HCOOH which further de…

    Submitted 15 February, 2024; originally announced February 2024.