Search | arXiv e-print repository

arXiv:2407.19156 [pdf, other]

Robust Multimodal 3D Object Detection via Modality-Agnostic Decoding and Proximity-based Modality Ensemble

Authors: Juhan Cha, Minseok Joo, Jihwan Park, Sanghyeok Lee, Injae Kim, Hyunwoo J. Kim

Abstract: Recent advancements in 3D object detection have benefited from multi-modal information from the multi-view cameras and LiDAR sensors. However, the inherent disparities between the modalities pose substantial challenges. We observe that existing multi-modal 3D object detection methods heavily rely on the LiDAR sensor, treating the camera as an auxiliary modality for augmenting semantic details. This often leads to not only underutilization of camera data but also significant performance degradation in scenarios where LiDAR data is unavailable. Additionally, existing fusion methods overlook the detrimental impact of sensor noise induced by environmental changes, on detection performance. In this paper, we propose MEFormer to address the LiDAR over-reliance problem by harnessing critical information for 3D object detection from every available modality while concurrently safeguarding against corrupted signals during the fusion process. Specifically, we introduce Modality Agnostic Decoding (MOAD) that extracts geometric and semantic features with a shared transformer decoder regardless of input modalities and provides promising improvement with a single modality as well as multi-modality. Additionally, our Proximity-based Modality Ensemble (PME) module adaptively utilizes the strengths of each modality depending on the environment while mitigating the effects of a noisy sensor. Our MEFormer achieves state-of-the-art performance of 73.9% NDS and 71.5% mAP in the nuScenes validation set. Extensive analyses validate that our MEFormer improves robustness against challenging conditions such as sensor malfunctions or environmental changes. The source code is available at https://github.com/hanchaa/MEFormer

Submitted 26 July, 2024; originally announced July 2024.

arXiv:2407.16586 [pdf, other]

Very-Large-Scale GPU-Accelerated Nuclear Gradient of Time-Dependent Density Functional Theory with Tamm-Dancoff Approximation and Range-Separated Hybrid Functionals

Authors: Inkoo Kim, Daun Jeong, Leah Weisburn, Alexandra Alexiu, Troy Van Voorhis, Young Min Rhee, Won-Joon Son, Hyung-Jin Kim, Jinkyu Yim, Sungmin Kim, Yeonchoo Cho, Inkook Jang, Seungmin Lee, Dae Sin Kim

Abstract: Modern graphics processing units (GPUs) provide an unprecedented level of computing power. In this study, we present a high-performance, multi-GPU implementation of the analytical nuclear gradient for Kohn-Sham time-dependent density functional theory (TDDFT), employing the Tamm-Dancoff approximation (TDA) and Gaussian-type atomic orbitals as basis functions. We discuss GPU-efficient algorithms for the derivatives of electron repulsion integrals and exchange-correlation functionals within the range-separated scheme. As an illustrative example, we calculated the TDA-TDDFT gradient of the S1 state of a full-scale green fluorescent protein with explicit water solvent molecules, totaling 4353 atoms, at the wB97X/def2-SVP level of theory. Our algorithm demonstrates favorable parallel efficiencies on a high-speed distributed system equipped with 256 Nvidia A100 GPUs, achieving >70% with up to 64 GPUs and 31% with 256 GPUs, effectively leveraging the capabilities of modern high-performance computing systems.

Submitted 23 July, 2024; originally announced July 2024.

Comments: 13 pages, 9 figures

arXiv:2407.11130 [pdf, other]

Geometric additivity of modular commutator for multipartite entanglement

Authors: Sung-Min Park, Isaac H. Kim, Eun-Gook Moon

Abstract: A recent surge of research in many-body quantum entanglement has uncovered intriguing properties of quantum many-body systems. A prime example is the modular commutator, which can extract a topological invariant from a single wave function. Here, we unveil novel geometric properties of many-body entanglement via a modular commutator of two-dimensional gapped quantum many-body systems. We obtain the geometric additivity of a modular commutator, indicating that modular commutator for a multipartite system may be an integer multiple of the one for tripartite systems. Using our additivity formula, we also derive a curious identity for the modular commutators involving disconnected intervals in a certain class of conformal field theories. We further illustrate this geometric additivity for both bulk and edge subsystems using numerical calculations of the Haldane and $π$-flux models.

Submitted 25 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

Comments: 4+8 pages, 6+10 figures, v2: updated references

arXiv:2407.10515 [pdf, other]

On possible values of the signature of flat symplectic bundles over surfaces with boundary

Authors: Inkang Kim, Pierre Pansu, Xueyuan Wan

Abstract: We show that every integer in the interval $[2pχ(Σ), -2pχ(Σ)]$ is achieved by the signature of a rank $2p$ flat symplectic bundle over a surface with boundary $Σ$. When $p=1$, one can prescribe the type (elliptic, parabolic, hyperbolic) of the holonomy along the boundary.

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2407.08976 [pdf, other]

Computational-Statistical Trade-off in Kernel Two-Sample Testing with Random Fourier Features

Authors: Ikjun Choi, Ilmun Kim

Abstract: Recent years have seen a surge in methods for two-sample testing, among which the Maximum Mean Discrepancy (MMD) test has emerged as an effective tool for handling complex and high-dimensional data. Despite its success and widespread adoption, the primary limitation of the MMD test has been its quadratic-time complexity, which poses challenges for large-scale analysis. While various approaches have been proposed to expedite the procedure, it has been unclear whether it is possible to attain the same power guarantee as the MMD test at sub-quadratic time cost. To fill this gap, we revisit the approximated MMD test using random Fourier features, and investigate its computational-statistical trade-off. We start by revealing that the approximated MMD test is pointwise consistent in power only when the number of random features approaches infinity. We then consider the uniform power of the test and study the time-power trade-off under the minimax testing framework. Our result shows that, by carefully choosing the number of random features, it is possible to attain the same minimax separation rates as the MMD test within sub-quadratic time. We demonstrate this point under different distributional assumptions such as densities in a Sobolev ball. Our theoretical findings are corroborated by simulation studies.

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.08586 [pdf, other]

Centrality dependence of Lévy-stable two-pion Bose-Einstein correlations in $\sqrt{s_{_{NN}}}=200$ GeV Au$+$Au collisions

Authors: PHENIX Collaboration, N. J. Abdulameer, U. Acharya, A. Adare, C. Aidala, N. N. Ajitanand, Y. Akiba, R. Akimoto, H. Al-Ta'ani, J. Alexander, A. Angerami, K. Aoki, N. Apadula, Y. Aramaki, H. Asano, E. C. Aschenauer, E. T. Atomssa, T. C. Awes, B. Azmoun, V. Babintsev, M. Bai, B. Bannier, K. N. Barish, B. Bassalleck, S. Bathe, et al. (377 additional authors not shown)

Abstract: The PHENIX experiment measured the centrality dependence of two-pion Bose-Einstein correlation functions in $\sqrt{s_{_{NN}}}=200$~GeV Au$+$Au collisions at the Relativistic Heavy Ion Collider at Brookhaven National Laboratory. The data are well represented by Lévy-stable source distributions. The extracted source parameters are the correlation-strength parameter $λ$, the Lévy index of stability $α$, and the Lévy-scale parameter $R$ as a function of transverse mass $m_T$ and centrality. The $λ(m_T)$ parameter is constant at larger values of $m_T$, but decreases as $m_T$ decreases. The Lévy scale parameter $R(m_T)$ decreases with $m_T$ and exhibits proportionality to the length scale of the nuclear overlap region. The Lévy exponent $α(m_T)$ is independent of $m_T$ within uncertainties in each investigated centrality bin, but shows a clear centrality dependence. At all centralities, the Lévy exponent $α$ is significantly different from that of Gaussian ($α=2$) or Cauchy ($α=1$) source distributions. Comparisons to the predictions of Monte-Carlo simulations of resonance-decay chains show that in all but the most peripheral centrality class (50%-60%), the obtained results are inconsistent with the measurements, unless a significant reduction of the in-medium mass of the $η'$ meson is included. In each centrality class, the best value of the in-medium $η'$ mass is compared to the mass of the $η$ meson, as well as to several theoretical predictions that consider restoration of $U_A(1)$ symmetry in hot hadronic matter.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 401 authors from 75 institutions, 20 pages, 15 figures, 2 tables. v1 is version submitted to Physical Review C. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html

arXiv:2406.16695 [pdf, other]

Geometry-Aware Score Distillation via 3D Consistent Noising and Gradient Consistency Modeling

Authors: Min-Seop Kwak, Donghoon Ahn, Ines Hyeonsu Kim, Jin-Hwa Kim, Seungryong Kim

Abstract: Score distillation sampling (SDS), the methodology in which the score from pretrained 2D diffusion models is distilled into 3D representation, has recently brought significant advancements in text-to-3D generation task. However, this approach is still confronted with critical geometric inconsistency problems such as the Janus problem. Starting from a hypothesis that such inconsistency problems may be induced by multiview inconsistencies between 2D scores predicted from various viewpoints, we introduce GSD, a simple and general plug-and-play framework for incorporating 3D consistency and therefore geometry awareness into the SDS process. Our methodology is composed of three components: 3D consistent noising, designed to produce 3D consistent noise maps that perfectly follow the standard Gaussian distribution, geometry-based gradient warping for identifying correspondences between predicted gradients of different viewpoints, and novel gradient consistency loss to optimize the scene geometry toward producing more consistent gradients. We demonstrate that our method significantly improves performance, successfully addressing the geometric inconsistency problems in text-to-3D generation task with minimal computation cost and being compatible with existing score distillation-based models. Our project page is available at https://ku-cvlab.github.io/GSD/.

Submitted 30 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.16042 [pdf, other]

Pose-Diversified Augmentation with Diffusion Model for Person Re-Identification

Authors: Inès Hyeonsu Kim, JoungBin Lee, Soowon Son, Woojeong Jin, Kyusun Cho, Junyoung Seo, Min-Seop Kwak, Seokju Cho, JeongYeol Baek, Byeongwon Lee, Seungryong Kim

Abstract: Person re-identification (Re-ID) often faces challenges due to variations in human poses and camera viewpoints, which significantly affect the appearance of individuals across images. Existing datasets frequently lack diversity and scalability in these aspects, hindering the generalization of Re-ID models to new camera systems. Previous methods have attempted to address these issues through data augmentation; however, they rely on human poses already present in the training dataset, failing to effectively reduce the human pose bias in the dataset. We propose Diff-ID, a novel data augmentation approach that incorporates sparse and underrepresented human pose and camera viewpoint examples into the training data, addressing the limited diversity in the original training data distribution. Our objective is to augment a training dataset that enables existing Re-ID models to learn features unbiased by human pose and camera viewpoint variations. To achieve this, we leverage the knowledge of pre-trained large-scale diffusion models. Using the SMPL model, we simultaneously capture both the desired human poses and camera viewpoints, enabling realistic human rendering. The depth information provided by the SMPL model indirectly conveys the camera viewpoints. By conditioning the diffusion model on both the human pose and camera viewpoint concurrently through the SMPL model, we generate realistic images with diverse human poses and camera viewpoints. Qualitative results demonstrate the effectiveness of our method in addressing human pose bias and enhancing the generalizability of Re-ID models compared to other data augmentation-based Re-ID approaches. The performance gains achieved by training Re-ID models on our offline augmented dataset highlight the potential of our proposed framework in improving the scalability and generalizability of person Re-ID models.

Submitted 23 June, 2024; originally announced June 2024.

Comments: The project page is available at https://ku-cvlab.github.io/Diff-ID/

arXiv:2406.13964 [pdf, other]

Hierarchical Micro-Segmentations for Zero-Trust Services via Large Language Model (LLM)-enhanced Graph Diffusion

Authors: Yinqiu Liu, Guangyuan Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Dong In Kim, Xuemin Shen

Abstract: In the rapidly evolving Next-Generation Networking (NGN) era, the adoption of zero-trust architectures has become increasingly crucial to protect security. However, provisioning zero-trust services in NGNs poses significant challenges, primarily due to the environmental complexity and dynamics. Motivated by these challenges, this paper explores efficient zero-trust service provisioning using hierarchical micro-segmentations. Specifically, we model zero-trust networks via hierarchical graphs, thereby jointly considering the resource- and trust-level features to optimize service efficiency. We organize such zero-trust networks through micro-segmentations, which support granular zero-trust policies efficiently. To generate the optimal micro-segmentation, we present the Large Language Model-Enhanced Graph Diffusion (LEGD) algorithm, which leverages the diffusion process to realize a high-quality generation paradigm. Additionally, we utilize policy boosting and Large Language Models (LLM) to enable LEGD to optimize the generation policy and understand complicated graphical features. Moreover, realizing the unique trustworthiness updates or service upgrades in zero-trust NGN, we further present LEGD-Adaptive Maintenance (LEGD-AM), providing an adaptive way to perform task-oriented fine-tuning on LEGD. Extensive experiments demonstrate that the proposed LEGD achieves 90% higher efficiency in provisioning services compared with other baselines. Moreover, the LEGD-AM can reduce the service outage time by over 50%.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 13 pages

arXiv:2406.13248 [pdf, other]

Overlay Space-Air-Ground Integrated Networks with SWIPT-Empowered Aerial Communications

Authors: Anuradha Verma, Pankaj Kumar Sharma, Pawan Kumar, Dong In Kim

Abstract: In this article, we consider overlay space-air-ground integrated networks (OSAGINs) where a low earth orbit (LEO) satellite communicates with ground users (GUs) with the assistance of an energy-constrained coexisting air-to-air (A2A) network. Particularly, a non-linear energy harvester with a hybrid SWIPT utilizing both power-splitting and time-switching energy harvesting (EH) techniques is employed at the aerial transmitter. Specifically, we take the random locations of the satellite, ground and aerial receivers to investigate the outage performance of both the satellite-to-ground and aerial networks leveraging the stochastic tools. By taking into account the Shadowed-Rician fading for satellite link, the Nakagami-\emph{m} for ground link, and the Rician fading for aerial link, we derive analytical expressions for the outage probability of these networks. For a comprehensive analysis of aerial network, we consider both the perfect and imperfect successive interference cancellation (SIC) scenarios. Through our analysis, we illustrate that, unlike linear EH, the implementation of non-linear EH provides accurate figures for any target rate, underscoring the significance of using non-linear EH models. Additionally, the influence of key parameters is emphasized, providing guidelines for the practical design of an energy-efficient as well as spectrum-efficient future non-terrestrial networks. Monte Carlo simulations validate the accuracy of our theoretical developments.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 36 pages, 14 figures, This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2406.11574 [pdf, ps, other]

Non-unitary Coupled Cluster Enabled by Mid-circuit Measurements on Quantum Computers

Authors: Alexandre Fleury, James Brown, Erika Lloyd, Maritza Hernandez, Isaac H. Kim

Abstract: Many quantum algorithms rely on a quality initial state for optimal performance. Preparing an initial state for specific applications can considerably reduce the cost of probabilistic algorithms such as the well studied quantum phase estimation (QPE). Fortunately, in the application space of quantum chemistry, generating approximate wave functions for molecular systems is well studied, and quantum computing algorithms stand to benefit from importing these classical methods directly into a quantum circuit. In this work, we propose a state preparation method based on coupled cluster (CC) theory, which is a pillar of quantum chemistry on classical computers, by incorporating mid-circuit measurements into the circuit construction. Currently, the most well studied state preparation method for quantum chemistry on quantum computers is the variational quantum eigensolver (VQE) with a unitary-CC with single- and double-electron excitation terms (UCCSD) ansatz whose operations are limited to unitary gates. We verify the accuracy of our state preparation protocol using mid-circuit measurements by performing energy evaluation and state overlap computation for a set of small chemical systems. We further demonstrate that our approach leads to a reduction of the classical computation overhead, and the number of CNOT and T gates by 28% and 57% on average when compared against the standard VQE-UCCSD protocol.

Submitted 28 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

Comments: 26 pages, 6 figures; title changed, references added

arXiv:2406.08301 [pdf, other]

Jet modification via $π^0$-hadron correlations in Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV

Authors: PHENIX Collaboration, N. J. Abdulameer, U. Acharya, A. Adare, S. Afanasiev, C. Aidala, N. N. Ajitanand, Y. Akiba, H. Al-Bataineh, J. Alexander, M. Alfred, K. Aoki, N. Apadula, L. Aphecetche, J. Asai, H. Asano, E. T. Atomssa, R. Averbeck, T. C. Awes, B. Azmoun, V. Babintsev, M. Bai, G. Baksay, L. Baksay, A. Baldisseri, et al. (510 additional authors not shown)

Abstract: High-momentum two-particle correlations are a useful tool for studying jet-quenching effects in the quark-gluon plasma. Angular correlations between neutral-pion triggers and charged hadrons with transverse momenta in the range 4--12~GeV/$c$ and 0.5--7~GeV/$c$, respectively, have been measured by the PHENIX experiment in 2014 for Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$~GeV. Suppression is observed in the yield of high-momentum jet fragments opposite the trigger particle, which indicates jet suppression stemming from in-medium partonic energy loss, while enhancement is observed for low-momentum particles. The ratio and differences between the yield in Au$+$Au collisions and $p$$+$$p$ collisions, $I_{AA}$ and $Δ_{AA}$, as a function of the trigger-hadron azimuthal separation, $Δφ$, are measured for the first time at the Relativistic Heavy Ion Collider. These results better quantify how the yield of low-$p_T$ associated hadrons is enhanced at wide angle, which is crucial for studying energy loss as well as medium-response effects.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 534 authors from 83 institutions, 12 pages, 7 figures. v1 is version submitted to Physical Review C. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html

arXiv:2406.07907 [pdf, ps, other]

Wall-crossing for K-moduli spaces of certain families of weighted projective hypersurfaces

Authors: In-Kyun Kim, Yuchen Liu, Chengxi Wang

Abstract: We describe the K-moduli spaces of weighted hypersurfaces of degree $2(n+3)$ in $\mathbb{P}(1,2,n+2,n+3)$. We show that the K-polystable limits of these weighted hypersurfaces are also weighted hypersurfaces of the same degree in the same weighted projective space. This is achieved by an explicit study of the wall crossing for K-moduli spaces $M_w$ of certain log Fano pairs with coefficient $w$ whose double cover gives the weighted hypersurface. Moreover, we show that the wall crossing of $M_w$ coincides with variation of GIT except at the last K-moduli wall which gives a divisorial contraction. Our K-moduli spaces provide new birational models for some natural loci in the moduli space of marked hyperelliptic curves.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 52 pages

arXiv:2406.06242 [pdf, ps, other]

Tjurina spectrum and Hertling conjecture

Authors: Seung-Jo Jung, In-Kyun Kim, Morihiko Saito, Youngho Yoon

Abstract: We present a proof of a conjecture of Q. Shi, Y. Wang and H. Zuo claiming that the maximal spectral number of a hypersurface isolated singularity does not belong to the Tjurina spectrum. This follows from the self-duality of the Jacobian ring, which is compatible with the action of $f$ and also with the $V$-filtration. We also provide a sufficient condition for the generalized Hertling conjecture on the variance of Tjurina spectrum to fail, and calculate some examples using some codes in Singular.

Submitted 12 July, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

arXiv:2406.04772 [pdf, other]

REP: Resource-Efficient Prompting for On-device Continual Learning

Authors: Sungho Jeon, Xinyue Ma, Kwang In Kim, Myeongjae Jeon

Abstract: On-device continual learning (CL) requires the co-optimization of model accuracy and resource efficiency to be practical. This is extremely challenging because it must preserve accuracy while learning new tasks with continuously drifting data and maintain both high energy and memory efficiency to be deployable on real-world devices. Typically, a CL method leverages one of two types of backbone networks: CNN or ViT. It is commonly believed that CNN-based CL excels in resource efficiency, whereas ViT-based CL is superior in model performance, making each option attractive only for a single aspect. In this paper, we revisit this comparison while embracing powerful pre-trained ViT models of various sizes, including ViT-Ti (5.8M parameters). Our detailed analysis reveals that many practical options exist today for making ViT-based methods more suitable for on-device CL, even when accuracy, energy, and memory are all considered. To further expand this impact, we introduce REP, which improves resource efficiency specifically targeting prompt-based rehearsal-free methods. Our key focus is on avoiding catastrophic trade-offs with accuracy while trimming computational and memory costs throughout the training process. We achieve this by exploiting swift prompt selection that enhances input data using a carefully provisioned model, and by developing two novel algorithms-adaptive token merging (AToM) and adaptive layer dropping (ALD)-that optimize the prompt updating stage. In particular, AToM and ALD perform selective skipping across the data and model-layer dimensions without compromising task-specific features in vision transformer models. Extensive experiments on three image classification datasets validate REP's superior resource efficiency over current state-of-the-art methods.

Submitted 7 June, 2024; originally announced June 2024.

Comments: 19 pages, 10 figures

arXiv:2406.01963 [pdf]

Diamond molecular balance: Revolutionizing high-resolution mass spectrometry from MDa to TDa at room temperature

Authors: Donggeun Lee, Seung-Woo Jeon, Chang-Hwan Yi, Yang-Hee Kim, Yeeun Choi, Sang-Hun Lee, Jinwoong Cha, Seung-Bo Shim, Junho Suh, Il-Young Kim, Dongyeon Daniel Kang, Hojoong Jung, Cherlhyun Jeong, Jae-pyoung Ahn, Hee Chul Park, Sang-Wook Han, Chulki Kim

Abstract: The significance of mass spectrometry lies in its unparalleled ability to accurately identify and quantify molecules in complex samples, providing invaluable insights into molecular structures and interactions. Here, we leverage diamond nanostructures as highly sensitive mass sensors by utilizing a self-excitation mechanism under an electron beam in a conventional scanning electron microscope (SEM). The diamond molecular balance (DMB) exhibits an exceptional mass resolution of 0.36 MDa, based on its outstanding mechanical quality factor and frequency stability, along with an extensive dynamic range from MDa to TDa. This positions the DMB at the forefront of molecular balances operating at room temperature. Notably, the DMB demonstrates its ability to measure the mass of a single bacteriophage T4 by precisely locating the analyte on the device. These findings highlight the groundbreaking potential of the DMB as a revolutionary tool for mass spectrometry at room temperature.

Submitted 25 July, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

Comments: 16 pages, 4 figures

arXiv:2406.00945 [pdf, other]

General relativistic self-gravitating equilibrium disks around rotating neutron stars

Authors: Yoonsoo Kim, Jinho Kim, Hee Il Kim, Hyung Mok Lee

Abstract: In modeling a relativistic disk around a compact object, the self-gravity of the disk is often neglected while it needs to be incorporated for more accurate descriptions in several circumstances. Extending the Komatsu-Eriguchi-Hachisu self-consistent field method, we present numerical models of a rapidly rotating neutron star with a self-gravitating disk in stationary equilibrium. In particular, our approach allows us to obtain numerical solutions involving a massive disk with the rest mass $O(10^{-1})-O(10^0) M_\odot$ closely attached to a rotating neutron star. We also assess the impact of self-gravity on the internal structure of the disk and the neutron star. These axisymmetric, stationary solutions can be employed for simulations involving the neutron star-disk system in the context of high-energy transients and gravitational wave emissions.

Submitted 2 June, 2024; originally announced June 2024.

Comments: 15 pages, 12 figures

arXiv:2405.19912 [pdf, other]

Robust Kernel Hypothesis Testing under Data Corruption

Authors: Antonin Schrab, Ilmun Kim

Abstract: We propose two general methods for constructing robust permutation tests under data corruption. The proposed tests effectively control the non-asymptotic type I error under data corruption, and we prove their consistency in power under minimal conditions. This contributes to the practical deployment of hypothesis tests for real-world applications with potential adversarial attacks. One of our methods inherently ensures differential privacy, further broadening its applicability to private data analysis. For the two-sample and independence settings, we show that our kernel robust tests are minimax optimal, in the sense that they are guaranteed to be non-asymptotically powerful against alternatives uniformly separated from the null in the kernel MMD and HSIC metrics at some optimal rate (tight with matching lower bound). Finally, we provide publicly available implementations and empirically illustrate the practicality of our proposed tests.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 26 pages, 2 figures, 2 algorithms

arXiv:2405.19704 [pdf, other]

Enhancing Sufficient Dimension Reduction via Hellinger Correlation

Authors: Seungbeom Hong, Ilmun Kim, Jun Song

Abstract: In this work, we develop a new theory and method for sufficient dimension reduction (SDR) in single-index models, where SDR is a sub-field of supervised dimension reduction based on conditional independence. Our work is primarily motivated by the recent introduction of the Hellinger correlation as a dependency measure. Utilizing this measure, we develop a method capable of effectively detecting the dimension reduction subspace, complete with theoretical justification. Through extensive numerical experiments, we demonstrate that our proposed method significantly enhances and outperforms existing SDR methods. This improvement is largely attributed to our proposed method's deeper understanding of data dependencies and the refinement of existing SDR techniques.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.17379 [pdf, other]

Classifying 2D topological phases: mapping ground states to string-nets

Authors: Isaac H. Kim, Daniel Ranard

Abstract: We prove the conjectured classification of topological phases in two spatial dimensions with gappable boundary, in a simplified setting. Two gapped ground states of lattice Hamiltonians are in the same quantum phase of matter, or topological phase, if they can be connected by a constant-depth quantum circuit. It is conjectured that the Levin-Wen string-net models exhaust all possible gapped phases with gappable boundary, and these phases are labeled by unitary modular tensor categories. We prove this under the assumption that every phase has a representative state with zero correlation length satisfying the entanglement bootstrap axioms, or a strict form of area law. Our main technical development is to transform these states into string-net states using constant-depth quantum circuits.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 48 pages, many figures

arXiv:2405.15714 [pdf, ps, other]

Mean Field Limit for Congestion Dynamics in One Dimension

Authors: Inwon Kim, Antoine Mellet, Jeremy Sheung-Him Wu

Abstract: This paper addresses congested transport, which can be described, at macroscopic scales, by a continuity equation with a pressure variable generated from the hard-congestion constraint (maximum value of the density). The main goal of the paper is to show that, in one spatial dimension, this continuum PDE can be derived as the mean-field limit of a system of ordinary differential equations that describes the motion of a large number of particles constrained to stay at some finite distance from each other. To show that these two models describe the same dynamics at different scales, we will rely on both the Eulerian and Lagrangian points of view and use two different approximations for the density and pressure variables in the continuum limit.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 23 pages, 1 figure

MSC Class: 35Q70

arXiv:2405.14155 [pdf]

Room-temperature waveguide-integrated photodetector using bolometric effect for mid-infrared spectroscopy applications

Authors: Joonsup Shim, Jinha Lim, Inki Kim, Jaeyong Jeong, Bong Ho Kim, Seong Kwang Kim, Dae-Myeong Geum, SangHyeon Kim

Abstract: Waveguide-integrated mid-infrared (MIR) photodetectors are pivotal components for developing molecular spectroscopy applications, leveraging mature photonic integrated circuit (PIC) technologies. Despite various strategies, critical challenges still remain in achieving broadband photoresponse, cooling-free operation, and large-scale complementary-metal-oxide-semiconductor (CMOS)-compatible manufacturability. To leap beyond these limitations, the bolometric effect - a thermal detection mechanism - is introduced into the waveguide platform. More importantly, we pursue a free-carrier absorption (FCA) process in germanium (Ge) to create an efficient light-absorbing medium, providing a pragmatic solution for full coverage of the MIR spectrum without incorporating exotic materials into CMOS. Here, we present an uncooled waveguide-integrated photodetector based on a Ge-on-insulator (Ge-OI) PIC architecture, exploiting the bolometric effect combined with FCA. Notably, our device exhibits a broadband responsivity of ~12 mA/W across 4030-4360 nm (and potentially beyond), challenging the state of the art, while achieving a noise-equivalent power of 3.4x10^-9 W/Hz^0.5 at 4180 nm. We further demonstrate label-free sensing of carbon dioxide using our integrated photodetector and sensing waveguide on a single chip.
This approach to room-temperature waveguide-integrated MIR photodetection, harnessing bolometry with FCA in Ge, not only facilitates the realization of fully integrated lab-on-a-chip systems with wavelength flexibility but also provides a blueprint for MIR PICs with CMOS-foundry-compatibility.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 6 figures for the main manuscript and 14 figures for the supplementary information

arXiv:2405.12472 [pdf, ps, other]

Optimizing Generative AI Networking: A Dual Perspective with Multi-Agent Systems and Mixture of Experts

Authors: Ruichen Zhang, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Ping Zhang, Dong In Kim

Abstract: In the continued development of next-generation networking and artificial intelligence content generation (AIGC) services, the integration of multi-agent systems (MAS) and the mixture of experts (MoE) frameworks is becoming increasingly important. Motivated by this, this article studies the contrasting and converging of MAS and MoE in AIGC-enabled networking. First, we discuss the architectural designs, operational procedures, and inherent advantages of using MAS and MoE in generative AI to explore its functionality and applications fully. Next, we review the applications of MAS and MoE frameworks in content generation and resource allocation, emphasizing their impact on networking operations. Subsequently, we propose a novel multi-agent-enabled MoE-proximal policy optimization (MoE-PPO) framework for 3D object generation and data transfer scenarios. The framework uses MAS for dynamic task coordination of each network service provider agent and MoE for expert-driven execution of respective tasks, thereby improving overall system efficiency and adaptability. The simulation results demonstrate the effectiveness of our proposed framework and significantly improve the performance indicators under different network conditions. Finally, we outline potential future research directions.

Submitted 20 May, 2024; originally announced May 2024.

Comments: 9 pages, 4 figures

arXiv:2405.10272 [pdf, other]

Faces that Speak: Jointly Synthesising Talking Face and Speech from Text

Authors: Youngjoon Jang, Ji-Hoon Kim, Junseok Ahn, Doyeop Kwak, Hong-Sun Yang, Yoon-Cheol Ju, Il-Hwan Kim, Byeong-Yeol Kim, Joon Son Chung

Abstract: The goal of this work is to simultaneously generate natural talking faces and speech outputs from text. We achieve this by integrating Talking Face Generation (TFG) and Text-to-Speech (TTS) systems into a unified framework. We address the main challenges of each task: (1) generating a range of head poses representative of real-world scenarios, and (2) ensuring voice consistency despite variations in facial motion for the same identity. To tackle these issues, we introduce a motion sampler based on conditional flow matching, which is capable of high-quality motion code generation in an efficient way. Moreover, we introduce a novel conditioning method for the TTS system, which utilises motion-removed features from the TFG model to yield uniform speech outputs. Our extensive experiments demonstrate that our method effectively creates natural-looking talking faces and speech that accurately match the input text. To our knowledge, this is the first effort to build a multimodal synthesis system that can generalise to unseen identities.

Submitted 16 May, 2024; originally announced May 2024.

Comments: CVPR 2024

arXiv:2405.04907 [pdf, other]

Empowering Wireless Networks with Artificial Intelligence Generated Graph

Authors: Jiacheng Wang, Yinqiu Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Haibo Zhou, Dong In Kim

Abstract: In wireless communications, transforming networks into graphs and processing them using deep learning models, such as Graph Neural Networks (GNNs), is one of the mainstream network optimization approaches. While effective, generative AI (GAI) shows stronger capabilities in graph analysis, processing, and generation than conventional methods such as GNNs, offering a broader exploration space for graph-based network optimization. Therefore, this article proposes to use GAI-based graph generation to support wireless networks. Specifically, we first explore applications of graphs in wireless networks. Then, we introduce and analyze common GAI models from the perspective of graph generation. On this basis, we propose a framework that incorporates the conditional diffusion model and an evaluation network, which can be trained with reward functions and conditions customized by network designers and users. Once trained, the proposed framework can create graphs based on new conditions, helping to tackle problems specified by the user in wireless networks. Finally, using the link selection in integrated sensing and communication (ISAC) as an example, the effectiveness of the proposed framework is validated.

Submitted 8 May, 2024; originally announced May 2024.

arXiv:2405.04198 [pdf, other]

Enhancing Physical Layer Communication Security through Generative AI with Mixture of Experts

Authors: Changyuan Zhao, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Dong In Kim, Xuemin, Shen, Khaled B. Letaief

Abstract: AI technologies have become more widely adopted in wireless communications. As an emerging type of AI technologies, the generative artificial intelligence (GAI) gains lots of attention in communication security. Due to its powerful learning ability, GAI models have demonstrated superiority over conventional AI methods. However, GAI still has several limitations, including high computational complexity and limited adaptability. Mixture of Experts (MoE), which uses multiple expert models for prediction through a gate mechanism, proposes possible solutions. Firstly, we review GAI model's applications in physical layer communication security, discuss limitations, and explore how MoE can help GAI overcome these limitations. Furthermore, we propose an MoE-enabled GAI framework for network optimization problems for communication security. To demonstrate the framework's effectiveness, we provide a case study in a cooperative friendly jamming scenario. The experimental results show that the MoE-enabled framework effectively assists the GAI algorithm, solves its limitations, and enhances communication security.

Submitted 7 May, 2024; originally announced May 2024.

Comments: 9 pages, 4 figures

arXiv:2404.18705 [pdf, other]

Wireless Information and Energy Transfer in the Era of 6G Communications

Authors: Constantinos Psomas, Konstantinos Ntougias, Nikita Shanin, Dongfang Xu, Kenneth MacSporran Mayer, Nguyen Minh Tran, Laura Cottatellucci, Kae Won Choi, Dong In Kim, Robert Schober, Ioannis Krikidis

Abstract: Wireless information and energy transfer (WIET) represents an emerging paradigm which employs controllable transmission of radio-frequency signals for the dual purpose of data communication and wireless charging. As such, WIET is widely regarded as an enabler of envisioned 6G use cases that rely on energy-sustainable Internet-of-Things (IoT) networks, such as smart cities and smart grids. Meeting the quality-of-service demands of WIET, in terms of both data transfer and power delivery, requires effective co-design of the information and energy signals. In this article, we present the main principles and design aspects of WIET, focusing on its integration in 6G networks. First, we discuss how conventional communication notions such as resource allocation and waveform design need to be revisited in the context of WIET. Next, we consider various candidate 6G technologies that can boost WIET efficiency, namely, holographic multiple-input multiple-output, near-field beamforming, terahertz communication, intelligent reflecting surfaces (IRSs), and reconfigurable (fluid) antenna arrays. We introduce respective WIET design methods, analyze the promising performance gains of these WIET systems, and discuss challenges, open issues, and future research directions. Finally, a near-field energy beamforming scheme and a power-based IRS beamforming algorithm are experimentally validated using a wireless energy transfer testbed.
The vision of WIET in communication systems has been gaining momentum in recent years, with constant progress with respect to theoretical but also practical aspects. The comprehensive overview of the state of the art of WIET presented in this paper highlights the potentials of WIET systems as well as their overall benefits in 6G networks.

Submitted 16 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

Comments: Proceedings of the IEEE, 36 pages, 33 figures

arXiv:2404.16356 [pdf, other]

Integration of Mixture of Experts and Multimodal Generative AI in Internet of Vehicles: A Survey

Authors: Minrui Xu, Dusit Niyato, Jiawen Kang, Zehui Xiong, Abbas Jamalipour, Yuguang Fang, Dong In Kim, Xuemin, Shen

Abstract: Generative AI (GAI) can enhance the cognitive, reasoning, and planning capabilities of intelligent modules in the Internet of Vehicles (IoV) by synthesizing augmented datasets, completing sensor data, and making sequential decisions. In addition, the mixture of experts (MoE) can enable the distributed and collaborative execution of AI models without performance degradation between connected vehicles. In this survey, we explore the integration of MoE and GAI to enable Artificial General Intelligence in IoV, which can enable the realization of full autonomy for IoV with minimal human supervision and applicability in a wide range of mobility scenarios, including environment monitoring, traffic management, and autonomous driving. In particular, we present the fundamentals of GAI, MoE, and their interplay applications in IoV. Furthermore, we discuss the potential integration of MoE and GAI in IoV, including distributed perception and monitoring, collaborative decision-making and planning, and generative modeling and simulation. Finally, we present several potential research directions for facilitating the integration.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2404.14140 [pdf, other]

Generative Artificial Intelligence Assisted Wireless Sensing: Human Flow Detection in Practical Communication Environments

Authors: Jiacheng Wang, Hongyang Du, Dusit Niyato, Zehui Xiong, Jiawen Kang, Bo Ai, Zhu Han, Dong In Kim

Abstract: Groundbreaking applications such as ChatGPT have heightened research interest in generative artificial intelligence (GAI). Essentially, GAI excels not only in content generation but also in signal processing, offering support for wireless sensing. Hence, we introduce a novel GAI-assisted human flow detection system (G-HFD). Rigorously, G-HFD first uses channel state information (CSI) to estimate the velocity and acceleration of propagation path length change of the human-induced reflection (HIR). Then, given the strong inference ability of the diffusion model, we propose a unified weighted conditional diffusion model (UW-CDM) to denoise the estimation results, enabling the detection of the number of targets. Next, we use the CSI obtained by a uniform linear array with wavelength spacing to estimate the HIR's time of flight and direction of arrival (DoA). In this process, UW-CDM solves the problem of ambiguous DoA spectrum, ensuring accurate DoA estimation. Finally, through clustering, G-HFD determines the number of subflows and the number of targets in each subflow, i.e., the subflow size. The evaluation based on practical downlink communication signals shows G-HFD's accuracy of subflow size detection can reach 91%. This validates its effectiveness and underscores the significant potential of GAI in the context of wireless sensing.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.12168 [pdf, other]

Real-World Efficient Blind Motion Deblurring via Blur Pixel Discretization

Authors: Insoo Kim, Jae Seok Choi, Geonseok Seo, Kinam Kwon, Jinwoo Shin, Hyong-Euk Lee

Abstract: As recent advances in mobile camera technology have enabled the capability to capture high-resolution images, such as 4K images, the demand for an efficient deblurring model handling large motion has increased. In this paper, we discover that the image residual errors, i.e., blur-sharp pixel differences, can be grouped into some categories according to their motion blur type and how complex their neighboring pixels are. Inspired by this, we decompose the deblurring (regression) task into blur pixel discretization (pixel-level blur classification) and discrete-to-continuous conversion (regression with blur class map) tasks. Specifically, we generate the discretized image residual errors by identifying the blur pixels and then transform them to a continuous form, which is computationally more efficient than naively solving the original regression problem with continuous values. Here, we found that the discretization result, i.e., blur segmentation map, remarkably exhibits visual similarity with the image residual errors. As a result, our efficient model shows comparable performance to state-of-the-art methods in realistic benchmarks, while our method is up to 10 times computationally more efficient.

Submitted 18 April, 2024; originally announced April 2024.

Comments: CVPR2024 Camera-Ready

arXiv:2404.10050 [pdf, other]

The Cost of Entanglement Renormalization on a Fault-Tolerant Quantum Computer

Authors: Joshua Job, Isaac H. Kim, Eric Johnston, Steve Adachi

Abstract: We perform a detailed resource estimate for the prospect of using deep entanglement renormalization ansatz (DMERA) on a fault-tolerant quantum computer, focusing on the regime in which the target system is large. For probing a relatively large system size ($64\times 64$), we observe up to an order of magnitude reduction in the number of qubits, compared to the approaches based on quantum phase estimation (QPE). We discuss two complementary strategies to measure the energy. The first approach is based on a random sampling of the local terms of the Hamiltonian, requiring $\mathcal{O}(1/ε^2)$ invocations of quantum circuits, each of which have depth of at most $\mathcal{O}(\log N)$, where $ε$ is the relative precision in the energy and $N$ is the system size. The second approach is based on a coherent estimation of the expectation value of observables averaged over space, which achieves the Heisenberg scaling while incurring only a logarithmic cost in the system size. For estimating the energy per site of $ε$, $\mathcal{O}\left(\frac{\log N}ε \right)$ $T$ gates and $\mathcal{O}\left(\log N \right)$ qubits suffice. The constant factor of the leading contribution is shown to be determined by the depth of the DMERA circuit, the gates used in the ansatz, and the periodicity of the circuit. We also derive tight bounds on the variance of the energy gradient, assuming the gates are random Pauli rotations.

Submitted 16 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

Comments: 21 pages. 12 figures, 2 appendices

arXiv:2404.09134 [pdf, ps, other]

Generative AI Agents with Large Language Model for Satellite Networks via a Mixture of Experts Transmission

Authors: Ruichen Zhang, Hongyang Du, Yinqiu Liu, Dusit Niyato, Jiawen Kang, Zehui Xiong, Abbas Jamalipour, Dong In Kim

Abstract: In response to the needs of 6G global communications, satellite communication networks have emerged as a key solution. However, the large-scale development of satellite communication networks is constrained by the complex system models, whose modeling is challenging for massive users. Moreover, transmission interference between satellites and users seriously affects communication performance. To solve these problems, this paper develops generative artificial intelligence (AI) agents for model formulation and then applies a mixture of experts (MoE) approach to design transmission strategies. Specifically, we leverage large language models (LLMs) to build an interactive modeling paradigm and utilize retrieval-augmented generation (RAG) to extract satellite expert knowledge that supports mathematical modeling. Afterward, by integrating the expertise of multiple specialized components, we propose an MoE-proximal policy optimization (PPO) approach to solve the formulated problem. Each expert can optimize the optimization variables at which it excels through specialized training through its own network and then aggregates them through the gating network to perform joint optimization. The simulation results validate the accuracy and effectiveness of employing a generative agent for problem formulation. Furthermore, the superiority of the proposed MoE-PPO approach over other benchmarks is confirmed in solving the formulated problem. The adaptability of MoE-PPO to various customized modeling problems has also been demonstrated.

Submitted 29 June, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

Comments: 15 pages, 10 figures

arXiv:2404.08396 [pdf, other]

Joint Computation Offloading and Target Tracking in Integrated Sensing and Communication Enabled UAV Networks

Authors: Trinh Van Chien, Mai Dinh Cong, Nguyen Cong Luong, Tri Nhu Do, Dong In Kim, Symeon Chatzinotas

Abstract: In this paper, we investigate a joint computation offloading and target tracking in Integrated Sensing and Communication (ISAC)-enabled unmanned aerial vehicle (UAV) network. Therein, the UAV has a computing task that is partially offloaded to the ground UE for execution. Meanwhile, the UAV uses the offloading bit sequence to estimate the velocity of a ground target based on an autocorrelation function. The performance of the velocity estimation that is represented by Cramer-Rao lower bound (CRB) depends on the length of the offloading bit sequence and the UAV's location. Thus, we jointly optimize the task size for offloading and the UAV's location to minimize the overall computation latency and the CRB of the mean square error for velocity estimation subject to the UAV's budget. The problem is non-convex, and we propose a genetic algorithm to solve it. Simulation results are provided to demonstrate the effectiveness of the proposed algorithm.

Submitted 12 April, 2024; originally announced April 2024.

Comments: 5 pages, 3 figures, 1 table. Accepted by IEEE Communications Letters

arXiv:2404.05867 [pdf, other]

Strict area law implies commuting parent Hamiltonian

Authors: Isaac H. Kim, Ting-Chun Lin, Daniel Ranard, Bowen Shi

Abstract: We show that in two spatial dimensions, when a quantum state has entanglement entropy obeying a strict area law, meaning $S(A)=α|\partial A| - γ$ for constants $α, γ$ independent of lattice region $A$, then it admits a commuting parent Hamiltonian. More generally, we prove that the entanglement bootstrap axioms in 2D imply the existence of a commuting, local parent Hamiltonian with a stable spectral gap. We also extend our proof to states that describe gapped domain walls. Physically, these results imply that the states studied in the entanglement bootstrap program correspond to ground states of some local Hamiltonian, describing a stable phase of matter. Our result also suggests that systems with chiral gapless edge modes cannot obey a strict area law provided they have finite local Hilbert space.

Submitted 8 April, 2024; originally announced April 2024.

Comments: 19+2 pages, 10 figures

arXiv:2404.03725 [pdf, other]

Conformal geometry from entanglement

Authors: Isaac H. Kim, Xiang Li, Ting-Chun Lin, John McGreevy, Bowen Shi

Abstract: In a physical system with conformal symmetry, observables depend on cross-ratios, measures of distance invariant under global conformal transformations (conformal geometry for short). We identify a quantum information-theoretic mechanism by which the conformal geometry emerges at the gapless edge of a 2+1D quantum many-body system with a bulk energy gap. We introduce a novel pair of information-theoretic quantities $(\mathfrak{c}_{\mathrm{tot}}, η)$ that can be defined locally on the edge from the wavefunction of the many-body system, without prior knowledge of any distance measure. We posit that, for a topological groundstate, the quantity $\mathfrak{c}_{\mathrm{tot}}$ is stationary under arbitrary variations of the quantum state, and study the logical consequences. We show that stationarity, modulo an entanglement-based assumption about the bulk, implies (i) $\mathfrak{c}_{\mathrm{tot}}$ is a non-negative constant that can be interpreted as the total central charge of the edge theory. (ii) $η$ is a cross-ratio, obeying the full set of mathematical consistency rules, which further indicates the existence of a distance measure of the edge with global conformal invariance. Thus, the conformal geometry emerges from a simple assumption on groundstate entanglement. We show that stationarity of $\mathfrak{c}_{\mathrm{tot}}$ is equivalent to a vector fixed-point equation involving $η$, making our assumption locally checkable. We also derive similar results for 1+1D systems under a suitable set of assumptions.

Submitted 4 April, 2024; originally announced April 2024.

Comments: 48+31 pages, 25 figures

arXiv:2404.03321 [pdf, other]

Fusion of Mixture of Experts and Generative Artificial Intelligence in Mobile Edge Metaverse

Authors: Guangyuan Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Abbas Jamalipour, Shiwen Mao, Dong In Kim

Abstract: In the digital transformation era, Metaverse offers a fusion of virtual reality (VR), augmented reality (AR), and web technologies to create immersive digital experiences. However, the evolution of the Metaverse is slowed down by the challenges of content creation, scalability, and dynamic user interaction. Our study investigates an integration of Mixture of Experts (MoE) models with Generative Artificial Intelligence (GAI) for mobile edge computing to revolutionize content creation and interaction in the Metaverse. Specifically, we harness an MoE model's ability to efficiently manage complex data and complex tasks by dynamically selecting the most relevant experts running various sub-models to enhance the capabilities of GAI. We then present a novel framework that improves video content generation quality and consistency, and demonstrate its application through case studies. Our findings underscore the efficacy of MoE and GAI integration to redefine virtual experiences by offering a scalable, efficient pathway to harvest the Metaverse's full potential.

Submitted 4 April, 2024; originally announced April 2024.

arXiv:2404.03102 [pdf, other]

Direct Experimental Constraints on the Spatial Extent of a Neutrino Wavepacket

Authors: Joseph Smolsky, Kyle G Leach, Ryan Abells, Pedro Amaro, Adrien Andoche, Keith Borbridge, Connor Bray, Robin Cantor, David Diercks, Spencer Fretwell, Stephan Friedrich, Abigail Gillespie, Mauro Guerra, Ad Hall, Cameron N Harris, Jackson T Harris, Calvin Hinkle, Amii Lamm, Leendert M Hayen, Paul-Antoine Hervieux, Geon-Bo Kim, Inwook Kim, Annika Lennarz, Vincenzo Lordi, Jorge Machado , et al. (13 additional authors not shown)

Abstract: Despite their high relative abundance in our Universe, neutrinos are the least understood fundamental particles of nature. They also provide a unique system to study quantum coherence and the wavelike nature of particles in fundamental systems due to their extremely weak interaction probabilities. In fact, the quantum properties of neutrinos emitted in experimentally relevant sources are virtually unknown and the spatial extent of the neutrino wavepacket is only loosely constrained by reactor neutrino oscillation data with a spread of 13 orders of magnitude. Here, we present the first direct limits of this quantity through a new experimental concept to extract the energy width, $σ_{\textrm{N},E}$, of the recoil daughter nucleus emitted in the nuclear electron capture (EC) decay of $^7$Be. The final state in the EC decay process contains a recoiling $^7$Li nucleus and an electron neutrino ($ν_e$) which are entangled at their creation. The $^7$Li energy spectrum is measured to high precision by directly embedding $^7$Be radioisotopes into a high resolution superconducting tunnel junction that is operated as a cryogenic sensor. The lower limit on the spatial uncertainty of the recoil daughter was found to be $σ_{\textrm{N}, x} \geq 6.2$ pm, which implies the final-state system is localized at a scale more than a thousand times larger than the nucleus itself. From this measurement, the first direct lower limits on the spatial extent of the neutrino wavepacket were extracted using two different theoretical methods. These results have wide-reaching implications in several areas including the nature of spatial localization at sub-atomic scales, interpretation of neutrino physics data, and the potential reach of future large-scale experiments.

Submitted 30 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

Comments: 20 pages, 3 figures, v3 corrects and updates one of the wavepacket width calculations

arXiv:2404.01954 [pdf, other]

HyperCLOVA X Technical Report

Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han , et al. (371 additional authors not shown)

Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.

Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: 44 pages; updated authors list and fixed author names

arXiv:2404.01583 [pdf, other]

Defining Problem from Solutions: Inverse Reinforcement Learning (IRL) and Its Applications for Next-Generation Networking

Authors: Yinqiu Liu, Ruichen Zhang, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Dong In Kim

Abstract: Performance optimization is a critical concern in networking, on which Deep Reinforcement Learning (DRL) has achieved great success. Nonetheless, DRL training relies on precisely defined reward functions, which formulate the optimization objective and indicate the positive/negative progress towards the optimal. With the ever-increasing environmental complexity and human participation in Next-Generation Networking (NGN), defining appropriate reward functions becomes challenging. In this article, we explore the applications of Inverse Reinforcement Learning (IRL) in NGN. Particularly, if DRL aims to find optimal solutions to the problem, IRL finds a problem from the optimal solutions, where the optimal solutions are collected from experts, and the problem is defined by reward inference. Specifically, we first formally introduce the IRL technique, including its fundamentals, workflow, and difference from DRL. Afterward, we present the motivations of IRL applications in NGN and survey existing studies. Furthermore, to demonstrate the process of applying IRL in NGN, we perform a case study about human-centric prompt engineering in Generative AI-enabled networks. We demonstrate the effectiveness of using both DRL and IRL techniques and prove the superiority of IRL.

Submitted 1 April, 2024; originally announced April 2024.

Comments: 9 pages

arXiv:2403.19968 [pdf, ps, other]

An existence and uniqueness result to evolution equations with sign-changing pseudo-differential operators and its applications to logarithmic Laplacian operators and second-order differential operators without ellipticity

Authors: Jae-Hwan Choi, Ildoo Kim

Abstract: We broaden the domain of the Fourier transform to contain all distributions without using the Paley-Wiener theorem and devise a new weak formulation built upon this extension. This formulation is applicable to evolution equations involving pseudo-differential operators, even when the signs of their symbols may vary over time. Notably, our main operator includes the logarithmic Laplacian operator $\log (-Δ)$ and a second-order differential operator whose leading coefficients are not positive semi-definite.

Submitted 29 March, 2024; originally announced March 2024.

Comments: 49 pages

MSC Class: 35S05; 35S10; 35A01; 35D30; 47G30

arXiv:2403.19132 [pdf, ps, other]

Meta-Heuristic Fronthaul Bit Allocation for Cell-free Massive MIMO Systems

Authors: Minje Kim, In-soo Kim, Junil Choi

Abstract: Limited capacity of fronthaul links in a cell-free massive multiple-input multiple-output (MIMO) system can cause quantization errors at a central processing unit (CPU) during data transmission, complicating the centralized rate optimization problem. Addressing this challenge, we propose a harmony search (HS)-based algorithm that renders the combinatorial non-convex problem tractable. One of the distinctive features of our algorithm is its hierarchical structure: it first allocates resources at the access point (AP) level and subsequently optimizes for user equipment (UE), ensuring a more efficient and structured approach to resource allocation. Our proposed algorithm deals with rigorous conditions, such as asymmetric fronthaul bit allocation and distinct quantization error levels at each AP, which were not considered in previous works. We derive a closed-form expression of signal-to-interference-plus-noise ratio (SINR), in which additive quantization noise model (AQNM) based distortion error is taken into account, to define the mathematical expression of spectral efficiency (SE) for each UE. Also, we provide analyses on computational complexity and convergence to investigate the practicality of the proposed algorithm. By leveraging various performance metrics such as total SE and max-min fairness, we demonstrate that the proposed algorithm can adaptively optimize the fronthaul bit allocation depending on system requirements. Finally, simulation results show that the proposed algorithm can achieve satisfactory performance while maintaining low computational complexity, as compared to the exhaustive search method.

Submitted 28 March, 2024; originally announced March 2024.

Comments: 16 pages, 13 figures, accepted to IEEE Transactions on Wireless Communications (TWC)

arXiv:2403.18410 [pdf, other]

Chiral Virasoro algebra from a single wavefunction

Authors: Isaac H. Kim, Xiang Li, Ting-Chun Lin, John McGreevy, Bowen Shi

Abstract: Chiral edges of 2+1D systems can have very robust emergent conformal symmetry. When the edge is purely chiral, the Hilbert space of low-energy edge excitations can form a representation of a single Virasoro algebra. We propose a method to systematically extract the generators of the Virasoro algebra from a single ground state wavefunction, using entanglement bootstrap and an input from the edge conformal field theory. We corroborate our construction by numerically verifying the commutation relations of the generators. We also study the unitary flows generated by these operators, whose properties (such as energy and state overlap) are shown numerically to agree with our analytical predictions.

Submitted 27 March, 2024; originally announced March 2024.

Comments: 60+20 pages, 28 figures

arXiv:2403.16477 [pdf, other]

Safeguarding Next Generation Multiple Access Using Physical Layer Security Techniques: A Tutorial

Authors: Lu Lv, Dongyang Xu, Rose Qingyang Hu, Yinghui Ye, Long Yang, Xianfu Lei, Xianbin Wang, Dong In Kim, Arumugam Nallanathan

Abstract: Driven by the ever-increasing requirements of ultra-high spectral efficiency, ultra-low latency, and massive connectivity, the forefront of wireless research calls for the design of advanced next generation multiple access schemes to facilitate provisioning of these stringent demands. This inspires the embrace of non-orthogonal multiple access (NOMA) in future wireless communication networks. Nevertheless, the support of massive access via NOMA leads to additional security threats, due to the open nature of the air interface, the broadcast characteristic of radio propagation, as well as the intertwined relationship among paired NOMA users. To address this specific challenge, the superimposed transmission of NOMA can be explored as a new opportunity for security-aware design; for example, multiuser interference inherent in NOMA can be constructively engineered to benefit communication secrecy and privacy. The purpose of this tutorial is to provide a comprehensive overview of the state-of-the-art physical layer security techniques that guarantee wireless security and privacy for NOMA networks, along with the opportunities, technical challenges, and future research trends.

Submitted 21 May, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: Invited paper by Proceedings of the IEEE

arXiv:2403.14191 [pdf, other]

doi 10.1016/j.compbiomed.2024.108241

PECI-Net: Bolus segmentation from video fluoroscopic swallowing study images using preprocessing ensemble and cascaded inference

Authors: Dougho Park, Younghun Kim, Harim Kang, Junmyeoung Lee, Jinyoung Choi, Taeyeon Kim, Sangeok Lee, Seokil Son, Minsol Kim, Injung Kim

Abstract: Bolus segmentation is crucial for the automated detection of swallowing disorders in videofluoroscopic swallowing studies (VFSS). However, it is difficult for the model to accurately segment a bolus region in a VFSS image because VFSS images are translucent, have low contrast and unclear region boundaries, and lack color information. To overcome these challenges, we propose PECI-Net, a network architecture for VFSS image analysis that combines two novel techniques: the preprocessing ensemble network (PEN) and the cascaded inference network (CIN). PEN enhances the sharpness and contrast of the VFSS image by combining multiple preprocessing algorithms in a learnable way. CIN reduces ambiguity in bolus segmentation by using context from other regions through cascaded inference. Moreover, CIN prevents undesirable side effects from unreliably segmented regions by referring to the context in an asymmetric way. In experiments, PECI-Net exhibited higher performance than four recently developed baseline models, outperforming TernausNet, the best among the baseline models, by 4.54% and the widely used UNet by 10.83%. The results of the ablation studies confirm that CIN and PEN are effective in improving bolus segmentation performance.

Submitted 21 March, 2024; originally announced March 2024.

Comments: 20 pages, 8 figures

Journal ref: Computers in Biology and Medicine (2024)

arXiv:2403.13237 [pdf, ps, other]

Graph Attention Network-based Block Propagation with Optimal AoI and Reputation in Web 3.0

Authors: Jiana Liao, Jinbo Wen, Jiawen Kang, Changyan Yi, Yang Zhang, Yutao Jiao, Dusit Niyato, Dong In Kim, Shengli Xie

Abstract: Web 3.0 is recognized as a pioneering paradigm that empowers users to securely oversee data without reliance on a centralized authority. Blockchains, as a core technology to realize Web 3.0, can facilitate decentralized and transparent data management. Nevertheless, the evolution of blockchain-enabled Web 3.0 is still in its nascent phase, grappling with challenges such as ensuring efficiency and reliability to enhance block propagation performance. In this paper, we design a Graph Attention Network (GAT)-based reliable block propagation optimization framework for blockchain-enabled Web 3.0. We first innovatively apply a data-freshness metric called age of block to measure block propagation efficiency in public blockchains. To achieve the reliability of block propagation, we introduce a reputation mechanism based on the subjective logic model, including the local and recommended opinions to calculate the miner reputation value. Moreover, considering that the GAT possesses the excellent ability to process graph-structured data, we utilize the GAT with reinforcement learning to obtain the optimal block propagation trajectory. Numerical results demonstrate that the proposed scheme exhibits the most outstanding block propagation efficiency and reliability compared with traditional routing mechanisms.

Submitted 8 May, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

arXiv:2403.08277 [pdf, other]

VIGFace: Virtual Identity Generation Model for Face Image Synthesis

Authors: Minsoo Kim, Min-Cheol Sagong, Gi Pyo Nam, Junghyun Cho, Ig-Jae Kim

Abstract: Deep learning-based face recognition continues to face challenges due to its reliance on huge datasets obtained from web crawling, which can be costly to gather and raise significant real-world privacy concerns. To address this issue, we propose VIGFace, a novel framework capable of generating synthetic facial images. Initially, we train the face recognition model using a real face dataset and create a feature space for both real and virtual IDs where virtual prototypes are orthogonal to other prototypes. Subsequently, we generate synthetic images by using the diffusion model based on the feature space. Our proposed framework provides two significant benefits. Firstly, it allows for creating virtual facial images without concerns about portrait rights, guaranteeing that the generated virtual face images are clearly differentiated from existing individuals. Secondly, it serves as an effective augmentation method by incorporating real existing images. Further experiments demonstrate the efficacy of our framework, achieving state-of-the-art results from both perspectives without any external data.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2403.08256 [pdf, other]

IG-FIQA: Improving Face Image Quality Assessment through Intra-class Variance Guidance robust to Inaccurate Pseudo-Labels

Authors: Minsoo Kim, Gi Pyo Nam, Haksub Kim, Haesol Park, Ig-Jae Kim

Abstract: In the realm of face image quality assessment (FIQA), methods based on sample relative classification have shown impressive performance. However, the quality scores used as pseudo-labels assigned from images of classes with low intra-class variance could be unrelated to the actual quality in this method. To address this issue, we present IG-FIQA, a novel approach to guide FIQA training, introducing a weight parameter to alleviate the adverse impact of these classes. This method involves estimating sample intra-class variance at each iteration during training, ensuring minimal computational overhead and straightforward implementation. Furthermore, this paper proposes an on-the-fly data augmentation methodology for improved generalization performance in FIQA. On various benchmark datasets, our proposed method, IG-FIQA, achieved novel state-of-the-art (SOTA) performance.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2403.04925 [pdf, ps, other]

Near Field Communications for DMA-NOMA Networks

Authors: Zheng Zhang, Yuanwei Liu, Zhaolin Wang, Jian Chen, Dong In Kim

Abstract: A novel near-field transmission framework is proposed for dynamic metasurface antenna (DMA)-enabled non-orthogonal multiple access (NOMA) networks. The base station (BS) exploits the hybrid beamforming to communicate with multiple near users (NUs) and far users (FUs) using the NOMA principle. Based on this framework, two novel beamforming schemes are proposed. 1) For the case of the grouped users distributed in the same direction, a beam-steering scheme is developed. The metric of beam pattern error (BPE) is introduced for the characterization of the gap between the hybrid beamformers and the desired ideal beamformers, where a two-layer algorithm is proposed to minimize BPE by optimizing hybrid beamformers. Then, the optimal power allocation strategy is obtained to maximize the sum achievable rate of the network. 2) For the case of users randomly distributed, a beam-splitting scheme is proposed, where two sub-beamformers are extracted from the single beamformer to serve different users in the same group. An alternating optimization (AO) algorithm is proposed for hybrid beamformer optimization, and the optimal power allocation is also derived. Numerical results validate that: 1) the proposed beamforming schemes exhibit superior performance compared with the existing imperfect-resolution-based beamforming scheme; 2) the communication rate of the proposed transmission framework is sensitive to the imperfect distance knowledge of NUs but not to that of FUs.

Submitted 7 March, 2024; originally announced March 2024.

Comments: 13 pages

arXiv:2403.01376 [pdf, other]

Fault-tolerant Quantum Error Correction Using a Linear Array of Emitters

Authors: Jintae Kim, Jung Hoon Han, Isaac H. Kim

Abstract: We propose a fault-tolerant quantum error correction architecture consisting of a linear array of emitters and delay lines. In our scheme, a resource state for fault-tolerant quantum computation is generated by letting the emitters interact with a stream of photons and their neighboring emitters. In the absence of delay line errors, our schemes have thresholds ranging between 0.32% and 0.39% against the standard circuit-level depolarizing error model. Depending on the number of emitters n_e, we study the effect of delay line errors in two regimes: when n_e is a small constant of order unity and when n_e scales with the code distance. Between these two regimes, the logical error rate steadily decreases as n_e increases, from an exponential decay in eta^{-1/2} to an exponential decay in eta^{-1}. We also carry out a detailed study of the break-even point and the fault-tolerance overhead. These studies suggest that the multi-emitter architecture, using the state-of-the-art delay lines, can be used to demonstrate error suppression, assuming other sources of errors are sufficiently small.

Submitted 2 March, 2024; originally announced March 2024.

Comments: 17 pages, 12 figures

arXiv:2402.18921 [pdf, other]

Semi-Supervised U-statistics

Authors: Ilmun Kim, Larry Wasserman, Sivaraman Balakrishnan, Matey Neykov

Abstract: Semi-supervised datasets are ubiquitous across diverse domains where obtaining fully labeled data is costly or time-consuming. The prevalence of such datasets has consistently driven the demand for new tools and methods that exploit the potential of unlabeled data. Responding to this demand, we introduce semi-supervised U-statistics enhanced by the abundance of unlabeled data, and investigate their statistical properties. We show that the proposed approach is asymptotically Normal and exhibits notable efficiency gains over classical U-statistics by effectively integrating various powerful prediction tools into the framework. To understand the fundamental difficulty of the problem, we derive minimax lower bounds in semi-supervised settings and showcase that our procedure is semi-parametrically efficient under regularity conditions. Moreover, tailored to bivariate kernels, we propose a refined approach that outperforms the classical U-statistic across all degeneracy regimes, and demonstrate its optimality properties. Simulation studies are conducted to corroborate our findings and to further demonstrate our framework.

Submitted 9 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

Showing 1–50 of 875 results for author: Kim, I