Showing 1–50 of 675 results for author: Han, B

  1. arXiv:2407.19739  [pdf]

    cs.NI cs.ET

    Nomadic Non-Public Networks for 6G: Use Cases and Key Performance Indicators

    Authors: Daniel Lindenschmitt, Benedikt Veith, Khurshid Alam, Ainur Aurembekova, Michael Gundall, Mohammad Asif Habibi, Bin Han, Dennis Krummacker, Philipp Rosemann, Hans D. Schotten

    Abstract: The landscape of wireless communication systems is evolving rapidly, with a pivotal role envisioned for dynamic network structures and self-organizing networks in upcoming technologies like the 6G mobile communications standard. This evolution is fueled by the growing demand from diverse sectors, including industry, manufacturing, agriculture, and the public sector, each with increasingly specific… ▽ More

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: 8 pages, 1 figure

  2. arXiv:2407.17710  [pdf, other]

    cs.LG

    Revisiting Machine Unlearning with Dimensional Alignment

    Authors: Seonguk Seo, Dongwan Kim, Bohyung Han

    Abstract: Machine unlearning, an emerging research topic focusing on compliance with data privacy regulations, enables trained models to remove the information learned from specific data. While many existing methods indirectly address this issue by intentionally injecting incorrect supervisions, they can drastically and unpredictably alter the decision boundaries and feature spaces, leading to training inst… ▽ More

    Submitted 24 July, 2024; originally announced July 2024.

  3. arXiv:2407.14902  [pdf, other]

    cond-mat.mes-hall

    Infrared magneto-polaritons in MoTe$_2$ mono- and bilayers

    Authors: Bo Han, Jamie M. Fitzgerald, Lukas Lackner, Roberto Rosati, Martin Esmann, Falk Eilenberger, Takashi Taniguchi, Kenji Watanabe, Marcin Syperek, Ermin Malic, Christian Schneider

    Abstract: MoTe$_2$ monolayers and bilayers are unique within the family of van-der-Waals materials since they pave the way towards atomically thin infrared light-matter quantum interfaces, potentially reaching the important telecommunication windows. Here, we report emergent exciton-polaritons based on MoTe$_2$ monolayer and bilayer in a low-temperature open micro-cavity in a joint experiment-theory study.… ▽ More

    Submitted 20 July, 2024; originally announced July 2024.

    Comments: 6 pages, 3 figures

  4. arXiv:2407.14841  [pdf, other]

    cs.CV

    Text-based Talking Video Editing with Cascaded Conditional Diffusion

    Authors: Bo Han, Heqing Zou, Haoyang Li, Guangcong Wang, Chng Eng Siong

    Abstract: Text-based talking-head video editing aims to efficiently insert, delete, and substitute segments of talking videos through a user-friendly text editing approach. It is challenging because of 1) generalizable talking-face representation, 2) seamless audio-visual transitions, and 3) identity-preserved talking faces. Previous works either require minutes of talking-face vi… ▽ More

    Submitted 20 July, 2024; originally announced July 2024.

  5. arXiv:2407.14129  [pdf, other]

    cs.LG

    Comparing and Contrasting Deep Learning Weather Prediction Backbones on Navier-Stokes and Atmospheric Dynamics

    Authors: Matthias Karlbauer, Danielle C. Maddix, Abdul Fatir Ansari, Boran Han, Gaurav Gupta, Yuyang Wang, Andrew Stuart, Michael W. Mahoney

    Abstract: Remarkable progress in the development of Deep Learning Weather Prediction (DLWP) models positions them to become competitive with traditional numerical weather prediction (NWP) models. Indeed, a wide number of DLWP architectures -- based on various backbones, including U-Net, Transformer, Graph Neural Network (GNN), and Fourier Neural Operator (FNO) -- have demonstrated their potential at forecas… ▽ More

    Submitted 19 July, 2024; originally announced July 2024.

  6. arXiv:2407.12076  [pdf, ps, other]

    math.CO

    Colored Multiset Eulerian Polynomials

    Authors: Danai Deligeorgaki, Bin Han, Liam Solus

    Abstract: Colored multiset Eulerian polynomials are a common generalization of MacMahon's multiset Eulerian polynomials and the colored Eulerian polynomials, both of which are known to satisfy well-studied distributional properties including real-rootedness, log-concavity and unimodality. The symmetric colored multiset Eulerian polynomials are characterized and used to prove sufficient conditions for a colo… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 23 pages

  7. arXiv:2407.08551  [pdf, other]

    cs.CL cs.SD eess.AS

    Autoregressive Speech Synthesis without Vector Quantization

    Authors: Lingwei Meng, Long Zhou, Shujie Liu, Sanyuan Chen, Bing Han, Shujie Hu, Yanqing Liu, Jinyu Li, Sheng Zhao, Xixin Wu, Helen Meng, Furu Wei

    Abstract: We present MELLE, a novel continuous-valued tokens based language modeling approach for text to speech synthesis (TTS). MELLE autoregressively generates continuous mel-spectrogram frames directly from text condition, bypassing the need for vector quantization, which are originally designed for audio compression and sacrifice fidelity compared to mel-spectrograms. Specifically, (i) instead of cross… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

  8. arXiv:2407.06827  [pdf, ps, other]

    math.PR math.AP

    On the support of solutions to nonlinear stochastic heat equations

    Authors: Beom-Seok Han, Kunwoo Kim, Jaeyun Yi

    Abstract: We investigate the strict positivity and the compact support property of solutions to the one-dimensional nonlinear stochastic heat equation: $$\partial_t u(t,x) = \frac{1}{2}\partial^2_x u(t,x) + \sigma(u(t,x))\dot{W}(t,x), \quad (t,x)\in \mathbf{R}_+\times\mathbf{R},$$ with nonnegative and compactly supported initial data $u_0$, where $\dot{W}$ is the space-time white noise and… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 31 pages

    MSC Class: 60H15; 35R60

  9. arXiv:2407.04557  [pdf, other]

    cond-mat.mtrl-sci cs.LG

    Structural Constraint Integration in Generative Model for Discovery of Quantum Material Candidates

    Authors: Ryotaro Okabe, Mouyang Cheng, Abhijatmedhi Chotrattanapituk, Nguyen Tuan Hung, Xiang Fu, Bowen Han, Yao Wang, Weiwei Xie, Robert J. Cava, Tommi S. Jaakkola, Yongqiang Cheng, Mingda Li

    Abstract: Billions of organic molecules are known, but only a tiny fraction of the functional inorganic materials have been discovered, a particularly relevant problem to the community searching for new quantum materials. Recent advancements in machine-learning-based generative models, particularly diffusion models, show great promise for generating new, stable materials. However, integrating geometric patt… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 512 pages total, 4 main figures + 218 supplementary figures

  10. arXiv:2407.04029  [pdf, other]

    cs.LG

    Robust Learning under Hybrid Noise

    Authors: Yang Wei, Shuo Chen, Shanshan Ye, Bo Han, Chen Gong

    Abstract: Feature noise and label noise are ubiquitous in practical scenarios, which pose great challenges for training a robust machine learning model. Most previous approaches usually deal with only a single problem of either feature noise or label noise. However, in real-world applications, hybrid noise, which contains both feature noise and label noise, is very common due to the unreliable data collecti… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  11. arXiv:2407.00750  [pdf, other]

    cs.CR cs.IT

    Physical Layer Deception with Non-Orthogonal Multiplexing

    Authors: Wenwen Chen, Bin Han, Yao Zhu, Anke Schmeink, Giuseppe Caire, Hans D. Schotten

    Abstract: Physical layer security (PLS) is a promising technology to secure wireless communications by exploiting the physical properties of the wireless channel. However, the passive nature of PLS creates a significant imbalance between the effort required by eavesdroppers and legitimate users to secure data. To address this imbalance, in this article, we propose a novel framework of physical layer decepti… ▽ More

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: Submitted to IEEE Transactions on Wireless Communications

  12. arXiv:2406.19930  [pdf]

    cs.RO cs.MA

    Exploring 6G Potential for Industrial Digital Twinning and Swarm Intelligence in Obstacle-Rich Environments

    Authors: Siyu Yuan, Khurshid Alam, Bin Han, Dennis Krummacker, Hans D. Schotten

    Abstract: With the advent of 6G technology, the demand for efficient and intelligent systems in industrial applications has surged, driving the need for advanced solutions in target localization. Utilizing swarm robots to locate unknown targets involves navigating increasingly complex environments. Digital Twinning (DT) offers a robust solution by creating a virtual replica of the physical world, which enha… ▽ More

    Submitted 2 July, 2024; v1 submitted 28 June, 2024; originally announced June 2024.

    Comments: Submitted to IEEE VTM

  13. arXiv:2406.18269  [pdf, other]

    physics.chem-ph physics.comp-ph

    Refining Potential Energy Surface through Dynamical Properties via Differentiable Molecular Simulation

    Authors: Bin Han, Kuang Yu

    Abstract: Recently, machine learning potentials (MLP) largely enhances the reliability of molecular dynamics, but its accuracy is limited by the underlying $\textit{ab initio}$ methods. A viable approach to overcome this limitation is to refine the potential by learning from experimental data, which now can be done efficiently using modern automatic differentiation technique. However, potential refinement i… ▽ More

    Submitted 27 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

  14. arXiv:2406.17196  [pdf, other]

    cs.IT eess.SY

    Coded Kalman Filtering over MIMO Gaussian Channels with Feedback

    Authors: Barron Han, Oron Sabag, Victoria Kostina, Babak Hassibi

    Abstract: We consider the problem of remotely stabilizing a linear dynamical system. In this setting, a sensor co-located with the system communicates the system's state to a controller over a noisy communication channel with feedback. The objective of the controller (decoder) is to use the channel outputs to estimate the vector state with finite zero-delay mean squared error (MSE) at the infinite horizon.… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted for presentation at the 2024 IEEE International Symposium on Information Theory

  15. arXiv:2406.13918  [pdf, other]

    cs.HC

    Are We There Yet? Unravelling Usability Challenges and Opportunities in Collaborative Immersive Analytics for Domain Experts

    Authors: Fahim Arsad Nafis, Alexander Rose, Simon Su, Songqing Chen, Bo Han

    Abstract: In the ever-evolving discipline of high-dimensional scientific data, collaborative immersive analytics (CIA) offers a promising frontier for domain experts in complex data visualization and interpretation. This research presents a comprehensive framework for conducting usability studies on the extended reality (XR) interface of ParaView, an open-source CIA system. By employing established human-co… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted in 26th International Conference on Human-Computer Interaction, HCII 2024, Washington, DC, USA

  16. arXiv:2406.11824  [pdf, other]

    cs.CV

    Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation

    Authors: Alexander Raistrick, Lingjie Mei, Karhan Kayan, David Yan, Yiming Zuo, Beining Han, Hongyu Wen, Meenal Parakh, Stamatis Alexandropoulos, Lahav Lipson, Zeyu Ma, Jia Deng

    Abstract: We introduce Infinigen Indoors, a Blender-based procedural generator of photorealistic indoor scenes. It builds upon the existing Infinigen system, which focuses on natural scenes, but expands its coverage to indoor scenes by introducing a diverse library of procedural indoor assets, including furniture, architecture elements, appliances, and other day-to-day objects. It also introduces a constrai… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted to CVPR 2024

  17. arXiv:2406.11793  [pdf, other]

    cs.RO

    FetchBench: A Simulation Benchmark for Robot Fetching

    Authors: Beining Han, Meenal Parakh, Derek Geng, Jack A Defay, Luyang Gan, Jia Deng

    Abstract: Fetching, which includes approaching, grasping, and retrieving, is a critical challenge for robot manipulation tasks. Existing methods primarily focus on table-top scenarios, which do not adequately capture the complexities of environments where both grasping and planning are essential. To address this gap, we propose a new benchmark FetchBench, featuring diverse procedural scenes that integrate b… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  18. arXiv:2406.11364  [pdf, other]

    cs.SD eess.AS

    AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection

    Authors: Anbai Jiang, Bing Han, Zhiqiang Lv, Yufeng Deng, Wei-Qiang Zhang, Xie Chen, Yanmin Qian, Jia Liu, Pingyi Fan

    Abstract: Large pre-trained models have demonstrated dominant performances in multiple areas, where the consistency between pre-training and fine-tuning is the key to success. However, few works reported satisfactory results of pre-trained models for the machine anomalous sound detection (ASD) task. This may be caused by the inconsistency of the pre-trained model and the inductive bias of machine audio, res… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

  19. arXiv:2406.10881  [pdf, other]

    cs.CL

    Teaching Large Language Models to Express Knowledge Boundary from Their Own Signals

    Authors: Lida Chen, Zujie Liang, Xintao Wang, Jiaqing Liang, Yanghua Xiao, Feng Wei, Jinglei Chen, Zhenghong Hao, Bing Han, Wei Wang

    Abstract: Large language models (LLMs) have achieved great success, but their occasional content fabrication, or hallucination, limits their practical application. Hallucination arises because LLMs struggle to admit ignorance due to inadequate training on knowledge boundaries. We call it a limitation of LLMs that they can not accurately express their knowledge boundary, answering questions they know while a… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  20. arXiv:2406.09179  [pdf, other]

    cs.LG

    Unlearning with Control: Assessing Real-world Utility for Large Language Model Unlearning

    Authors: Qizhou Wang, Bo Han, Puning Yang, Jianing Zhu, Tongliang Liu, Masashi Sugiyama

    Abstract: The compelling goal of eradicating undesirable data behaviors, while preserving usual model functioning, underscores the significance of machine unlearning within the domain of large language models (LLMs). Recent research has begun to approach LLM unlearning via gradient ascent (GA) -- increasing the prediction risk for those training strings targeted to be unlearned, thereby erasing their parame… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  21. Contextual Distillation Model for Diversified Recommendation

    Authors: Fan Li, Xu Si, Shisong Tang, Dingmin Wang, Kunyan Han, Bing Han, Guorui Zhou, Yang Song, Hechang Chen

    Abstract: The diversity of recommendation is equally crucial as accuracy in improving user experience. Existing studies, e.g., Determinantal Point Process (DPP) and Maximal Marginal Relevance (MMR), employ a greedy paradigm to iteratively select items that optimize both accuracy and diversity. However, prior methods typically exhibit quadratic complexity, limiting their applications to the re-ranking stage… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: accepted by KDD 2024

  22. arXiv:2406.08288  [pdf, other]

    cs.LG

    Decoupling the Class Label and the Target Concept in Machine Unlearning

    Authors: Jianing Zhu, Bo Han, Jiangchao Yao, Jianliang Xu, Gang Niu, Masashi Sugiyama

    Abstract: Machine unlearning as an emerging research topic for data regulations, aims to adjust a trained model to approximate a retrained one that excludes a portion of training data. Previous studies showed that class-wise unlearning is successful in forgetting the knowledge of a target class, through gradient ascent on the forgetting data or fine-tuning with the remaining data. However, while these metho… ▽ More

    Submitted 16 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  23. arXiv:2406.07955  [pdf, other]

    cs.LG stat.ML

    How Interpretable Are Interpretable Graph Neural Networks?

    Authors: Yongqiang Chen, Yatao Bian, Bo Han, James Cheng

    Abstract: Interpretable graph neural networks (XGNNs) are widely adopted in various scientific applications involving graph-structured data. Existing XGNNs predominantly adopt the attention-based mechanism to learn edge or node importance for extracting and making predictions with the interpretable subgraph. However, the representational properties and limitations of these methods remain inadequately explo… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: ICML2024, 44 pages, 21 figures, 12 tables

  24. arXiv:2406.07855  [pdf, other]

    cs.CL cs.SD eess.AS

    VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment

    Authors: Bing Han, Long Zhou, Shujie Liu, Sanyuan Chen, Lingwei Meng, Yanming Qian, Yanqing Liu, Sheng Zhao, Jinyu Li, Furu Wei

    Abstract: With the help of discrete neural audio codecs, large language models (LLM) have increasingly been recognized as a promising methodology for zero-shot Text-to-Speech (TTS) synthesis. However, sampling based decoding strategies bring astonishing diversity to generation, but also pose robustness issues such as typos, omissions and repetition. In addition, the high sampling rate of audio also brings h… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 15 pages, 5 figures

  25. arXiv:2406.07337  [pdf, other]

    cs.LG

    Transferring Knowledge from Large Foundation Models to Small Downstream Models

    Authors: Shikai Qiu, Boran Han, Danielle C. Maddix, Shuai Zhang, Yuyang Wang, Andrew Gordon Wilson

    Abstract: How do we transfer the relevant knowledge from ever larger foundation models into small, task-specific downstream models that can run at much lower costs? Standard transfer learning using pre-trained weights as the initialization transfers limited information and commits us to often massive pre-trained architectures. This procedure also precludes combining multiple pre-trained models that learn co… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: ICML 2024. Code available at https://github.com/amazon-science/adaptive-feature-transfer

  26. arXiv:2406.07006  [pdf, other]

    cs.CV

    MIPI 2024 Challenge on Few-shot RAW Image Denoising: Methods and Results

    Authors: Xin Jin, Chunle Guo, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Ruoqi Li, Chang Liu, Ziyi Wang, Yao Du, Jingjing Yang, Long Bao, Heng Sun, Xiangyu Kong, Xiaoxia Xing, Jinlong Wu, Yuanyang Xue, Hyunhee Park, Sejun Song, Changho Kim, Jingfan Tan, et al. (17 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 Mobile Intelligent Photography and Imaging (MIPI) Workshop--Few-shot RAW Image Denoising Challenge Report. Website: https://mipi-challenge.org/MIPI2024/

  27. arXiv:2406.04496  [pdf, other]

    cs.CL cs.AI cs.LG

    Time Sensitive Knowledge Editing through Efficient Finetuning

    Authors: Xiou Ge, Ali Mousavi, Edouard Grave, Armand Joulin, Kun Qian, Benjamin Han, Mostafa Arefiyan, Yunyao Li

    Abstract: Large Language Models (LLMs) have demonstrated impressive capability in different tasks and are bringing transformative changes to many domains. However, keeping the knowledge in LLMs up-to-date remains a challenge once pretraining is complete. It is thus essential to design effective methods to both update obsolete knowledge and induce new knowledge into LLMs. Existing locate-and-edit knowledge e… ▽ More

    Submitted 22 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: ACL 2024 main

  28. arXiv:2406.03631  [pdf, other]

    cs.LG

    Discovering Bias in Latent Space: An Unsupervised Debiasing Approach

    Authors: Dyah Adila, Shuai Zhang, Boran Han, Yuyang Wang

    Abstract: The question-answering (QA) capabilities of foundation models are highly sensitive to prompt variations, rendering their performance susceptible to superficial, non-meaning-altering changes. This vulnerability often stems from the model's preference or bias towards specific input characteristics, such as option position or superficial image features in multi-modal settings. We propose to rectify t… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Journal ref: ICML 2024

  29. arXiv:2406.00806  [pdf, other]

    cs.LG

    Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection

    Authors: Chentao Cao, Zhun Zhong, Zhanke Zhou, Yang Liu, Tongliang Liu, Bo Han

    Abstract: Detecting out-of-distribution (OOD) samples is essential when deploying machine learning models in open-world scenarios. Zero-shot OOD detection, requiring no training on in-distribution (ID) data, has been possible with the advent of vision-language models like CLIP. Existing methods build a text-based classifier with only closed-set labels. However, this largely restricts the inherent capability… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  30. arXiv:2405.19919  [pdf, other]

    cs.LG cs.SI

    Unraveling the Impact of Heterophilic Structures on Graph Positive-Unlabeled Learning

    Authors: Yuhao Wu, Jiangchao Yao, Bo Han, Lina Yao, Tongliang Liu

    Abstract: While Positive-Unlabeled (PU) learning is vital in many real-world scenarios, its application to graph data still remains under-explored. We unveil that a critical challenge for PU learning on graph lies on the edge heterophily, which directly violates the irreducibility assumption for Class-Prior Estimation (class prior is essential for building PU learning algorithms) and degenerates the latent… ▽ More

    Submitted 1 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  31. arXiv:2405.18972  [pdf, other]

    cs.LG cs.DC

    Federated Learning with Bilateral Curation for Partially Class-Disjoint Data

    Authors: Ziqing Fan, Ruipeng Zhang, Jiangchao Yao, Bo Han, Ya Zhang, Yanfeng Wang

    Abstract: Partially class-disjoint data (PCDD), a common yet under-explored data formation where each client contributes a part of classes (instead of all classes) of samples, severely challenges the performance of federated algorithms. Without full classes, the local objective will contradict the global objective, yielding the angle collapse problem for locally missing classes and the space waste problem f… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

  32. arXiv:2405.18786  [pdf, other]

    cs.LG cs.CV

    MOKD: Cross-domain Finetuning for Few-shot Classification via Maximizing Optimized Kernel Dependence

    Authors: Hongduan Tian, Feng Liu, Tongliang Liu, Bo Du, Yiu-ming Cheung, Bo Han

    Abstract: In cross-domain few-shot classification, nearest centroid classifier (NCC) aims to learn representations to construct a metric space where few-shot classification can be performed by measuring the similarities between samples and the prototype of each class. An intuition behind NCC is that each sample is pulled closer to the class centroid it belongs to while pushed away from those of other… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

  33. arXiv:2405.16996  [pdf, other]

    cs.CV

    Mitigating Noisy Correspondence by Geometrical Structure Consistency Learning

    Authors: Zihua Zhao, Mengxi Chen, Tianjie Dai, Jiangchao Yao, Bo Han, Ya Zhang, Yanfeng Wang

    Abstract: Noisy correspondence that refers to mismatches in cross-modal data pairs, is prevalent on human-annotated or web-crawled datasets. Prior approaches to leverage such data mainly consider the application of uni-modal noisy label learning without amending the impact on both cross-modal and intra-modal geometrical structures in multimodal learning. Actually, we find that both structures are effective… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 10 pages, 5 figures, received by IEEE/CVF Computer Science and Pattern Recognition

  34. arXiv:2405.16820  [pdf, other]

    cs.LG cs.AI cs.CY cs.HC

    Laboratory-Scale AI: Open-Weight Models are Competitive with ChatGPT Even in Low-Resource Settings

    Authors: Robert Wolfe, Isaac Slaughter, Bin Han, Bingbing Wen, Yiwei Yang, Lucas Rosenblatt, Bernease Herman, Eva Brown, Zening Qu, Nic Weber, Bill Howe

    Abstract: The rapid proliferation of generative AI has raised questions about the competitiveness of lower-parameter, locally tunable, open-weight models relative to high-parameter, API-guarded, closed-weight models in terms of performance, domain adaptation, cost, and generalization. Centering under-resourced yet risk-intolerant settings in government, research, and healthcare, we see for-profit closed-wei… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted at the ACM Conference on Fairness, Accountability, and Transparency (FAccT) 2024

  35. arXiv:2405.16262  [pdf, other]

    cs.LG

    Layer-Aware Analysis of Catastrophic Overfitting: Revealing the Pseudo-Robust Shortcut Dependency

    Authors: Runqi Lin, Chaojian Yu, Bo Han, Hang Su, Tongliang Liu

    Abstract: Catastrophic overfitting (CO) presents a significant challenge in single-step adversarial training (AT), manifesting as highly distorted deep neural networks (DNNs) that are vulnerable to multi-step adversarial attacks. However, the underlying factors that lead to the distortion of decision boundaries remain unclear. In this work, we delve into the specific changes within different DNN layers and… ▽ More

    Submitted 25 May, 2024; originally announced May 2024.

  36. arXiv:2405.12698  [pdf, other]

    cond-mat.mes-hall cond-mat.mtrl-sci cond-mat.str-el

    Correlated magnetism of moiré exciton-polaritons on a triangular electron-spin lattice

    Authors: Johannes Scherzer, Lukas Lackner, Bo Han, Borislav Polovnikov, Lukas Husel, Jonas Göser, Zhijie Li, Jens-Christian Drawer, Martin Esmann, Christoph Bennenhei, Falk Eilenberger, Kenji Watanabe, Takashi Taniguchi, Anvar S. Baimuratov, Christian Schneider, Alexander Högele

    Abstract: We demonstrate evidence of correlated magnetism for exciton-polaritons in a MoSe$_{2}$/WS$_{2}$ moiré heterostructure with near-parallel alignment subject to electron doping. In our experiments, interactions between electrons and moiré excitons are controlled electrostatically by field-effect doping, and the polaritonic regime of strong light-matter coupling is established in an open cryogenic mic… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.

  37. arXiv:2405.11969  [pdf, ps, other]

    math.PR math.AP

    Sobolev regularity theory for stochastic reaction-diffusion-advection equations with spatially homogeneous colored noises and variable-order nonlocal operators

    Authors: Jae-Hwan Choi, Beom-Seok Han, Daehan Park

    Abstract: This article investigates the existence, uniqueness, and regularity of solutions to nonlinear stochastic reaction-diffusion-advection equations (SRDAEs) with spatially homogeneous colored noises and variable-order nonlocal operators in mixed norm $L_q(L_p)$-spaces. We introduce a new condition (strongly reinforced Dalang's condition) on colored noise, which facilitates a deeper understanding of th… ▽ More

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 47 pages

    MSC Class: 60H15; 35R60

  38. arXiv:2405.11473  [pdf, other]

    cs.CV cs.AI

    FIFO-Diffusion: Generating Infinite Videos from Text without Training

    Authors: Jihwan Kim, Junoh Kang, Jinyoung Choi, Bohyung Han

    Abstract: We propose a novel inference technique based on a pretrained diffusion model for text-conditional video generation. Our approach, called FIFO-Diffusion, is conceptually capable of generating infinitely long videos without additional training. This is achieved by iteratively performing diagonal denoising, which concurrently processes a series of consecutive frames with increasing noise levels in a… ▽ More

    Submitted 12 June, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

    Comments: Project Page: https://jjihwan.github.io/projects/FIFO-Diffusion

  39. arXiv:2405.10422  [pdf, other]

    cs.NI

    A First Look at Immersive Telepresence on Apple Vision Pro

    Authors: Ruizhi Cheng, Nan Wu, Matteo Varvello, Eugene Chai, Songqing Chen, Bo Han

    Abstract: Due to the widespread adoption of "work-from-home" policies, videoconferencing applications (e.g., Zoom) have become indispensable for remote communication. However, these systems lack immersiveness, leading to the so-called "Zoom fatigue" and degrading communication efficiency. The recent debut of Apple Vision Pro, a mixed reality headset that supports "spatial persona", aims to offer an immersiv… ▽ More

    Submitted 16 May, 2024; originally announced May 2024.

  40. arXiv:2405.09892  [pdf, other]

    cs.LG cs.DC

    Balancing Similarity and Complementarity for Federated Learning

    Authors: Kunda Yan, Sen Cui, Abudukelimu Wuerkaixi, Jingfeng Zhang, Bo Han, Gang Niu, Masashi Sugiyama, Changshui Zhang

    Abstract: In mobile and IoT systems, Federated Learning (FL) is increasingly important for effectively using data while maintaining user privacy. One key challenge in FL is managing statistical heterogeneity, such as non-i.i.d. data, arising from numerous clients and diverse data sources. This requires strategic cooperation, often with clients having similar characteristics. However, we are interested in a… ▽ More

    Submitted 16 May, 2024; originally announced May 2024.

  41. arXiv:2405.09245  [pdf, other]

    eess.SP

    A Robust UAV-Based Approach for Power-Modulated Jammer Localization Using DoA

    Authors: Zexin Fang, Bin Han, Hans D. Schotten

    Abstract: Unmanned aerial vehicles (UAVs) are well-suited to localize jammers, particularly when jammers are at non-terrestrial locations, where conventional detection methods face challenges. In this work we propose a novel localization method, sample pruning gradient descend (SPGD), which offers robust performance against multiple power-modulated jammers with low computational complexity.

    Submitted 21 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

    Comments: Submitted to the 2024 IEEE 100th Vehicular Technology Conference (VTC2024-Fall)

  42. arXiv:2405.07780  [pdf, other]

    cs.LG cs.AI cs.CV

    Harnessing Hierarchical Label Distribution Variations in Test Agnostic Long-tail Recognition

    Authors: Zhiyong Yang, Qianqian Xu, Zitai Wang, Sicong Li, Boyu Han, Shilong Bao, Xiaochun Cao, Qingming Huang

    Abstract: This paper explores test-agnostic long-tail recognition, a challenging long-tail task where the test label distributions are unknown and arbitrarily imbalanced. We argue that the variation in these distributions can be broken down hierarchically into global and local levels. The global ones reflect a broad range of diversity, while the local ones typically arise from milder changes, often focused…

    Submitted 13 May, 2024; originally announced May 2024.

  43. arXiv:2405.04867  [pdf, other

    eess.IV cs.CV

    MIPI 2024 Challenge on Demosaic for HybridEVS Camera: Methods and Results

    Authors: Yaqi Wu, Zhihao Fan, Xiaofeng Chu, Jimmy S. Ren, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangcheng Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Senyan Xu, Zhijing Sun, Jiaying Zhu, Yurui Zhu, Xueyang Fu, Zheng-Jun Zha, Jun Cao, Cheng Li, Shu Chen, Liang Ma, Shiyang Zhou, Haijin Zeng, Kai Feng , et al. (24 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: MIPI@CVPR2024. Website: https://mipi-challenge.org/MIPI2024/

  44. arXiv:2404.17729  [pdf, other

    cs.CL

    CoMM: Collaborative Multi-Agent, Multi-Reasoning-Path Prompting for Complex Problem Solving

    Authors: Pei Chen, Boran Han, Shuai Zhang

    Abstract: Large Language Models (LLMs) have shown great ability in solving traditional natural language tasks and elementary reasoning tasks with appropriate prompting techniques. However, their ability is still limited in solving complicated science problems. In this work, we aim to push the upper bound of the reasoning capability of LLMs by proposing a collaborative multi-agent, multi-reasoning-path (CoMM…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Accepted to NAACL 2024

  45. arXiv:2404.16880  [pdf, other

    q-bio.QM cs.AI cs.CL

    Atomas: Hierarchical Alignment on Molecule-Text for Unified Molecule Understanding and Generation

    Authors: Yikun Zhang, Geyan Ye, Chaohao Yuan, Bo Han, Long-Kai Huang, Jianhua Yao, Wei Liu, Yu Rong

    Abstract: Molecule-and-text cross-modal representation learning has emerged as a promising direction for enhancing the quality of molecular representation, thereby improving performance in various scientific fields, including drug discovery and materials science. Existing studies adopt a global alignment approach to learn the knowledge from different modalities. These global alignment approaches fail to cap…

    Submitted 23 April, 2024; originally announced April 2024.

  46. arXiv:2404.16484  [pdf, other

    cs.CV eess.IV

    Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

    Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi Jin, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu , et al. (50 additional authors not shown)

    Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF cod…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: CVPR 2024, AI for Streaming (AIS) Workshop

  47. arXiv:2404.12886  [pdf, other

    cs.CV cs.LG

    MCM: Multi-condition Motion Synthesis Framework

    Authors: Zeyu Ling, Bo Han, Yongkang Wongkan, Han Lin, Mohan Kankanhalli, Weidong Geng

    Abstract: Conditional human motion synthesis (HMS) aims to generate human motion sequences that conform to specific conditions. Text and audio represent the two predominant modalities employed as HMS control conditions. While existing research has primarily focused on single conditions, the multi-condition human motion synthesis remains underexplored. In this study, we propose a multi-condition HMS framewor…

    Submitted 19 April, 2024; originally announced April 2024.

    Journal ref: International Joint Conference on Artificial Intelligence 2024

  48. arXiv:2404.10343  [pdf, other

    cs.CV eess.IV

    The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang , et al. (109 additional authors not shown)

    Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such…

    Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

  49. arXiv:2404.09960  [pdf, other

    stat.ME stat.AP

    Pseudo P-values for Assessing Covariate Balance in a Finite Study Population with Application to the California Sugar Sweetened Beverage Tax Study

    Authors: Bing Han, Margo A. Sidell

    Abstract: Assessing covariate balance (CB) is a common practice in various types of evaluation studies. Two-sample descriptive statistics, such as the standardized mean difference, have been widely applied in the scientific literature to assess the goodness of CB. Studies in health policy, health services research, built and social environment research, and many other fields often involve a finite number of…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 26 pages in total, 2 figures, 6 tables

  50. arXiv:2404.09790  [pdf, other

    cs.CV

    NTIRE 2024 Challenge on Image Super-Resolution ($\times$4): Methods and Results

    Authors: Zheng Chen, Zongwei Wu, Eduard Zamfir, Kai Zhang, Yulun Zhang, Radu Timofte, Xiaokang Yang, Hongyuan Yu, Cheng Wan, Yuxin Hong, Zhijuan Huang, Yajun Zou, Yuan Huang, Jiamin Lin, Bingnan Han, Xianyu Guan, Yongsheng Yu, Daoan Zhang, Xuanwu Yin, Kunlong Zuo, Jinhua Hao, Kai Zhao, Kun Yuan, Ming Sun, Chao Zhou , et al. (63 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 challenge on image super-resolution ($\times$4), highlighting the solutions proposed and the outcomes obtained. The challenge involves generating corresponding high-resolution (HR) images, magnified by a factor of four, from low-resolution (LR) inputs using prior information. The LR images originate from bicubic downsampling degradation. The aim of the challenge i…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: NTIRE 2024 webpage: https://cvlai.net/ntire/2024. Code: https://github.com/zhengchen1999/NTIRE2024_ImageSR_x4