-
DB-GPT-Hub: Towards Open Benchmarking Text-to-SQL Empowered by Large Language Models
Authors:
Fan Zhou,
Siqiao Xue,
Danrui Qi,
Wenhui Shi,
Wang Zhao,
Ganglin Wei,
Hongyang Zhang,
Caigai Jiang,
Gangwei Jiang,
Zhixuan Chu,
Faqiang Chen
Abstract:
Large language models (LLMs) have become the dominant paradigm for the challenging task of text-to-SQL. LLM-empowered text-to-SQL methods are typically categorized into prompting-based and tuning-based approaches. Compared to prompting-based methods, benchmarking fine-tuned LLMs for text-to-SQL is important yet under-explored, partly because of the prohibitively high computational cost. In this paper, we present DB-GPT-Hub, an open benchmark suite for LLM-empowered text-to-SQL, which primarily focuses on tuning LLMs at large scales. The proposed benchmark consists of: 1. a standardized and comprehensive evaluation of text-to-SQL tasks by fine-tuning medium to large-sized open LLMs; 2. a modularized and easy-to-extend codebase with mainstream LLMs and experimental scenarios supported, which prioritizes fine-tuning methods but can be easily extended to the prompt-based setting. Our work investigates the potential gains and the performance boundaries of tuning approaches compared to prompting approaches, and explores optimal solutions tailored to specific scenarios. We hope DB-GPT-Hub, along with these findings, enables further research and broad applications that would otherwise be difficult owing to the absence of a dedicated open benchmark. The project code has been released at https://github.com/eosphoros-ai/DB-GPT-Hub.
Submitted 17 June, 2024;
originally announced June 2024.
-
Embedded cylindrical and doughnut-shaped $λ$-hypersurfaces
Authors:
Qing-Ming Cheng,
Junqi Lai,
Guoxin Wei
Abstract:
In this paper, we construct, for $λ>0$, complete embedded and non-convex $λ$-hypersurfaces which are diffeomorphic to a cylinder. Hence, one cannot expect $λ$-hypersurfaces to share a common conclusion on the planar domain conjecture, even though the planar domain conjecture of T. Ilmanen for self-shrinkers of mean curvature flow was solved affirmatively by Brendle \cite{B}. Furthermore, for a fixed $λ<0$, which may have small $|λ|$, we construct two compact embedded $λ$-hypersurfaces which are diffeomorphic to $\mathbb{S}^{1}\times \mathbb{S}^{n-1}$ but are not isometric to each other.
Submitted 16 June, 2024;
originally announced June 2024.
-
Training-free Camera Control for Video Generation
Authors:
Chen Hou,
Guoqiang Wei,
Yan Zeng,
Zhibo Chen
Abstract:
We propose a training-free and robust solution to offer camera movement control for off-the-shelf video diffusion models. Unlike previous work, our method does not require any supervised finetuning on camera-annotated datasets or self-supervised training via data augmentation. Instead, it can be plugged into most pretrained video diffusion models and generate camera-controllable videos with a single image or text prompt as input. The inspiration for our work comes from the layout prior that intermediate latents hold over generated results: rearranging noisy pixels in them causes the output content to be rearranged as well. Since camera movement can also be seen as a kind of pixel rearrangement caused by perspective change, videos can be reorganized to follow a specific camera motion if their noisy latents change accordingly. Building on this, we propose our method CamTrol, which enables robust camera control for video diffusion models. It is achieved by a two-stage process. First, we model image layout rearrangement through explicit camera movement in 3D point cloud space. Second, we generate videos with camera motion using the layout prior of noisy latents formed by a series of rearranged images. Extensive experiments demonstrate the robustness of our method in controlling the camera motion of generated videos. Furthermore, we show that our method can produce impressive results in generating 3D rotation videos with dynamic content. Project page at https://lifedecoder.github.io/CamTrol/.
Submitted 14 June, 2024;
originally announced June 2024.
-
Graph-Based Bidirectional Transformer Decision Threshold Adjustment Algorithm for Class-Imbalanced Molecular Data
Authors:
Nicole Hayes,
Ekaterina Merkurjev,
Guo-Wei Wei
Abstract:
Data sets with imbalanced class sizes, often where one class size is much smaller than that of others, occur extremely often in various applications, including those with biological foundations, such as drug discovery and disease diagnosis. Thus, it is extremely important to be able to identify data elements of classes of various sizes, as a failure to detect can result in heavy costs. However, many data classification algorithms do not perform well on imbalanced data sets as they often fail to detect elements belonging to underrepresented classes. In this paper, we propose the BTDT-MBO algorithm, incorporating Merriman-Bence-Osher (MBO) techniques and a bidirectional transformer, as well as distance correlation and decision threshold adjustments, for data classification problems on highly imbalanced molecular data sets, where the sizes of the classes vary greatly. The proposed method not only integrates adjustments in the classification threshold for the MBO algorithm in order to help deal with the class imbalance, but also uses a bidirectional transformer model based on an attention mechanism for self-supervised learning. Additionally, the method implements distance correlation as a weight function for the similarity graph-based framework on which the adjusted MBO algorithm operates. The proposed model is validated using six molecular data sets, and we also provide a thorough comparison to other competing algorithms. The computational experiments show that the proposed method performs better than competing techniques even when the class imbalance ratio is very high.
Submitted 19 June, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Nonlinear saturation of reversed shear Alfven eigenmode via high-frequency quasi-mode generation
Authors:
Zhiwen Cheng,
Guangyu Wei,
Lei Ye,
Zhiyong Qiu
Abstract:
A nonlinear saturation mechanism for the reversed shear Alfven eigenmode (RSAE) is proposed and analysed, and is shown to be relevant to the typical reactor parameter region. The saturation is achieved through the generation of a high-frequency quasi-mode due to the nonlinear coupling of two RSAEs; the quasi-mode is then damped through coupling with the shear Alfven continuum, which leads to the nonlinear saturation of the primary RSAEs. An estimate of the nonlinear damping rate is also provided.
Submitted 9 June, 2024;
originally announced June 2024.
-
Benchmarking AlphaFold3's protein-protein complex accuracy and machine learning prediction reliability for binding free energy changes upon mutation
Authors:
JunJie Wee,
Guo-Wei Wei
Abstract:
AlphaFold 3 (AF3), the latest version of protein structure prediction software, goes beyond its predecessors by predicting protein-protein complexes. It could revolutionize drug discovery and protein engineering, marking a major step towards comprehensive, automated protein structure prediction. However, independent validation of AF3's predictions is necessary. Evaluated using the SKEMPI 2.0 database, which involves 317 protein-protein complexes and 8338 mutations, AF3 complex structures give rise to a very good Pearson correlation coefficient of 0.86 for predicting protein-protein binding free energy changes upon mutation, slightly less than the 0.88 achieved earlier with the Protein Data Bank (PDB) structures. Nonetheless, AF3 complex structures led to an 8.6% increase in the prediction RMSE compared to the original PDB complex structures. Additionally, some of AF3's complex structures have large errors, which were not captured in its ipTM performance metric. Finally, it is found that AF3's complex structures are not reliable for intrinsically flexible regions or domains.
Submitted 6 June, 2024;
originally announced June 2024.
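The two metrics quoted in the AlphaFold 3 abstract above, the Pearson correlation coefficient and the RMSE, can be illustrated with a minimal sketch. The function names and toy numbers below are hypothetical and are not taken from the paper or from SKEMPI 2.0; the code only shows the standard textbook definitions of the two quantities.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and standard deviations (unnormalized; the 1/n factors cancel).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def rmse(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Root-mean-square error between predictions and measurements."""
    n = len(predicted)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)


# Toy values only (hypothetical binding free energy changes, arbitrary units).
measured = [0.5, 1.2, -0.3, 2.0]
predicted = [0.7, 1.0, -0.1, 1.6]
print(pearson(predicted, measured), rmse(predicted, measured))
```

A high correlation and a higher RMSE can coexist, which is why the abstract reports both: correlation measures how well the ranking and trend are preserved, while RMSE measures the absolute size of the errors.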
-
Evolutionary Khovanov homology
Authors:
Li Shen,
Jian Liu,
Guo-Wei Wei
Abstract:
Knot theory is the study of embeddings of closed circles into three-dimensional Euclidean space, motivated by the ubiquity of knots in daily life and human civilization. However, current knot theory focuses on topology rather than metric. As such, the application of knot theory remains primitive and qualitative. Motivated by the need for quantitative knot data analysis (KDA), this work introduces metric information into knot theory through the evolutionary Khovanov homology (EKH), to facilitate a multiscale KDA of real-world data. It is demonstrated that EKH exhibits non-trivial knot invariants at appropriate scales even if the global topological structure of a knot is simple. The proposed EKH has great potential for KDA and knot learning.
Submitted 16 June, 2024; v1 submitted 4 June, 2024;
originally announced June 2024.
-
SAM-VMNet: Deep Neural Networks For Coronary Angiography Vessel Segmentation
Authors:
Xueying Zeng,
Baixiang Huang,
Yu Luo,
Guangyu Wei,
Songyan He,
Yushuang Shao
Abstract:
Coronary artery disease (CAD) is one of the most prevalent diseases in the cardiovascular field and one of the major contributors to death worldwide. Computed Tomography Angiography (CTA) images are regarded as the authoritative standard for the diagnosis of coronary artery disease, and by performing vessel segmentation and stenosis detection on CTA images, physicians are able to diagnose coronary artery disease more accurately. In order to combine the advantages of both the base model and the domain-specific model, and to achieve high-precision and fully automatic segmentation and detection with a limited number of training samples, we propose a novel architecture, SAM-VMNet, which combines the powerful feature extraction capability of MedSAM with the linear complexity of the visual state-space model of VM-UNet, giving it faster inference than Vision Transformer and stronger data processing capability, and achieving higher segmentation accuracy and stability for CTA images. Experimental results show that the SAM-VMNet architecture performs excellently in the CTA image segmentation task, with a segmentation accuracy of up to 98.32% and a sensitivity of up to 99.33%, which is significantly better than other existing models, and has stronger domain adaptability. Comprehensive evaluation of the CTA image segmentation task shows that SAM-VMNet accurately extracts the vascular trunks and capillaries, demonstrating its great potential and wide range of application scenarios for the vascular segmentation task, and also laying a solid foundation for further stenosis detection.
Submitted 1 June, 2024;
originally announced June 2024.
-
Mamba-R: Vision Mamba ALSO Needs Registers
Authors:
Feng Wang,
Jiahao Wang,
Sucheng Ren,
Guoyizhe Wei,
Jieru Mei,
Wei Shao,
Yuyin Zhou,
Alan Yuille,
Cihang Xie
Abstract:
This paper identifies artifacts within the feature maps of Vision Mamba, similar to those found in Vision Transformers. These artifacts, corresponding to high-norm tokens emerging in low-information background areas of images, appear much more severe in Vision Mamba -- they exist prevalently even with the tiny-sized model and activate extensively across background regions. To mitigate this issue, we follow the prior solution of introducing register tokens into Vision Mamba. To better cope with Mamba blocks' uni-directional inference paradigm, two key modifications are introduced: 1) evenly inserting registers throughout the input token sequence, and 2) recycling registers for final decision predictions. We term this new architecture Mamba-R. Qualitative observations suggest that, compared to vanilla Vision Mamba, Mamba-R's feature maps appear cleaner and more focused on semantically meaningful regions. Quantitatively, Mamba-R attains stronger performance and scales better. For example, on the ImageNet benchmark, our base-size Mamba-R attains 82.9% accuracy, significantly outperforming Vim-B's 81.8%; furthermore, we provide the first successful scaling to the large model size (i.e., with 341M parameters), attaining a competitive accuracy of 83.2% (84.5% if finetuned with 384x384 inputs). Additional validation on the downstream semantic segmentation task also supports Mamba-R's efficacy.
Submitted 23 May, 2024;
originally announced May 2024.
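The two modifications named in the Mamba-R abstract above (evenly inserting registers and recycling them for the final prediction) can be sketched on plain Python lists. This is only an illustration under stated assumptions: the helper names `insert_registers_evenly` and `recycle_registers` are hypothetical, and the exact placement rule (one register after each near-equal chunk of tokens) is an assumption, not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def insert_registers_evenly(
    tokens: Sequence[str], registers: Sequence[str]
) -> Tuple[List[str], List[int]]:
    """Place one register after each of len(registers) near-equal token chunks.

    Returns the mixed sequence and the index of every register in it.
    The chunking rule is an assumption made for illustration.
    """
    n_tok, n_reg = len(tokens), len(registers)
    mixed: List[str] = []
    positions: List[int] = []
    for i, reg in enumerate(registers):
        start = i * n_tok // n_reg
        end = (i + 1) * n_tok // n_reg
        mixed.extend(tokens[start:end])   # the i-th chunk of ordinary tokens
        positions.append(len(mixed))      # index where the register will sit
        mixed.append(reg)
    if n_reg == 0:
        mixed.extend(tokens)              # no registers: sequence is unchanged
    return mixed, positions


def recycle_registers(outputs: Sequence[str], positions: Sequence[int]) -> List[str]:
    """Gather the outputs at register positions, e.g. to feed a prediction head."""
    return [outputs[p] for p in positions]


patches = [f"p{i}" for i in range(6)]
regs = ["R0", "R1", "R2"]
sequence, reg_positions = insert_registers_evenly(patches, regs)
print(sequence)
print(recycle_registers(sequence, reg_positions))
```

In a real model the entries would be embedding vectors and the gathered register outputs would be combined before the classifier; strings are used here only to make the interleaving visible.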
-
Classification of Lagrangian translators and Lagrangian self-expanders in $\mathbb{C}^{2}$
Authors:
Zhi Li,
Guoxin Wei
Abstract:
In this paper, we obtain several classification results for $2$-dimensional complete Lagrangian translators and Lagrangian self-expanders with constant squared norm $|\vec{H}|^{2}$ of the mean curvature vector in $\mathbb{C}^{2}$ by using a new Omori-Yau type maximum principle which was proved by Chen and Qiu \cite{CQ}. The same idea is also used to give a similar result for Lagrangian $ξ$-translators in $\mathbb{C}^{2}$.
Submitted 22 May, 2024;
originally announced May 2024.
-
Carbon Connect: An Ecosystem for Sustainable Computing
Authors:
Benjamin C. Lee,
David Brooks,
Arthur van Benthem,
Udit Gupta,
Gage Hills,
Vincent Liu,
Benjamin Pierce,
Christopher Stewart,
Emma Strubell,
Gu-Yeon Wei,
Adam Wierman,
Yuan Yao,
Minlan Yu
Abstract:
Computing is at a moment of profound opportunity. Emerging applications -- such as capable artificial intelligence, immersive virtual realities, and pervasive sensor systems -- drive unprecedented demand for computing. Despite recent advances toward net zero carbon emissions, the computing industry's gross energy usage continues to rise at an alarming rate, outpacing the growth of new energy installations and renewable energy deployments. A shift towards sustainability is needed to spark a transformation in how computer systems are manufactured, allocated, and consumed.
Carbon Connect envisions coordinated research thrusts that produce design and management strategies for sustainable, next-generation computer systems. These strategies must flatten and then reverse growth trajectories for computing power and carbon for society's most rapidly growing applications such as artificial intelligence and virtual spaces. We will require accurate models for carbon accounting in computing technology. For embodied carbon, we must re-think conventional design strategies -- over-provisioned monolithic servers, frequent hardware refresh cycles, custom silicon -- and adopt life-cycle design strategies that more effectively reduce, reuse and recycle hardware at scale. For operational carbon, we must not only embrace renewable energy but also design systems to use that energy more efficiently. Finally, new hardware design and management strategies must be cognizant of economic policy and the regulatory landscape, aligning private initiatives with societal goals. Many of these broader goals will require computer scientists to develop deep, enduring collaborations with researchers in economics, law, and industrial ecology to spark change in broader practice.
Submitted 22 May, 2024;
originally announced May 2024.
-
DocReLM: Mastering Document Retrieval with Language Model
Authors:
Gengchen Wei,
Xinle Pang,
Tianning Zhang,
Yu Sun,
Xun Qian,
Chen Lin,
Han-Sen Zhong,
Wanli Ouyang
Abstract:
With over 200 million published academic documents and millions of new documents being written each year, academic researchers face the challenge of searching for information within this vast corpus. However, existing retrieval systems struggle to understand the semantics and domain knowledge present in academic papers. In this work, we demonstrate that by utilizing large language models, a document retrieval system can achieve advanced semantic understanding capabilities, significantly outperforming existing systems. Our approach involves training the retriever and reranker using domain-specific data generated by large language models. Additionally, we utilize large language models to identify candidates from the references of retrieved papers to further enhance the performance. We use a test set annotated by academic researchers in the fields of quantum physics and computer vision to evaluate our system's performance. The results show that DocReLM achieves a Top 10 accuracy of 44.12% in computer vision, compared to Google Scholar's 15.69%, and 36.21% in quantum physics, compared to Google Scholar's 12.96%.
Submitted 19 May, 2024;
originally announced May 2024.
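The "Top 10 accuracy" reported in the DocReLM abstract above can be illustrated with a short sketch. The abstract does not define the metric precisely, so the definition used here is an assumption: a query counts as a hit if at least one annotated relevant document appears among its first k results. The function name and the toy data are hypothetical.

```python
from typing import Dict, List, Set


def top_k_accuracy(
    rankings: Dict[str, List[str]],
    relevant: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Fraction of queries with at least one relevant document in the top k.

    This hit-based definition is an assumption made for illustration.
    """
    if not rankings:
        return 0.0
    hits = 0
    for query, ranked_docs in rankings.items():
        gold = relevant.get(query, set())
        # Only the first k retrieved documents are inspected.
        if any(doc in gold for doc in ranked_docs[:k]):
            hits += 1
    return hits / len(rankings)


# Toy data only: two queries, three retrieved documents each.
rankings = {"q1": ["a", "b", "c"], "q2": ["d", "e", "f"]}
relevant = {"q1": {"c"}, "q2": {"x"}}
print(top_k_accuracy(rankings, relevant, k=3))
```

Under this definition, enlarging k can only keep the score the same or raise it, which is one reason a fixed cutoff such as 10 is reported when comparing systems.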
-
Finite Diffeomorphism Theorem for manifolds with lower Ricci curvature and bounded energy
Authors:
Wenshuai Jiang,
Guofang Wei
Abstract:
In this paper we prove that the space $\cM(n,\rv,D,Λ):=\{(M^n,g) \text{ closed }: ~~\Ric\ge -(n-1),~\Vol(M)\ge \rv>0, \diam(M)\le D \text{ and } \int_{M}|\Rm|^{n/2}\le Λ\}$ has at most $C(n,\rv,D,Λ)$ many diffeomorphism types. This removes the upper Ricci curvature bound of Anderson-Cheeger's finite diffeomorphism theorem in \cite{AnCh}. Furthermore, if $M$ is a Kähler surface, the Riemann curvature $L^2$ bound can be replaced by the scalar curvature $L^2$ bound.
Submitted 12 May, 2024;
originally announced May 2024.
-
Volume growth and positive scalar curvature
Authors:
Guodong Wei,
Guoyi Xu,
Shuai Zhang
Abstract:
For three dimensional complete, non-compact Riemannian manifolds with non-negative Ricci curvature and uniformly positive scalar curvature, we obtain the sharp linear volume growth ratio and the corresponding rigidity.
Submitted 7 May, 2024;
originally announced May 2024.
-
Navigating Chemical Space with Latent Flows
Authors:
Guanghao Wei,
Yining Huang,
Chenru Duan,
Yue Song,
Yuanqi Du
Abstract:
Recent progress of deep generative models in the vision and language domain has stimulated significant interest in more structured data generation such as molecules. However, beyond generating new random molecules, efficient exploration and a comprehensive understanding of the vast chemical space are of great importance to molecular science and applications in drug design and materials discovery. In this paper, we propose a new framework, ChemFlow, to traverse chemical space by navigating the latent space learned by molecule generative models via flows. We introduce a dynamical system perspective that formulates the problem as learning a vector field that transports the mass of the molecular distribution to the region with desired molecular properties or structure diversity. Under this framework, we unify previous approaches on molecule latent space traversal and optimization and propose alternative competing methods incorporating different physical priors. We validate the efficacy of ChemFlow on molecule manipulation and single- and multi-objective molecule optimization tasks under both supervised and unsupervised molecular discovery settings. Codes and demos are publicly available on GitHub at https://github.com/garywei944/ChemFlow.
Submitted 7 May, 2024; v1 submitted 6 May, 2024;
originally announced May 2024.
-
Is Flash Attention Stable?
Authors:
Alicia Golden,
Samuel Hsia,
Fei Sun,
Bilge Acun,
Basil Hosmer,
Yejin Lee,
Zachary DeVito,
Jeff Johnson,
Gu-Yeon Wei,
David Brooks,
Carole-Jean Wu
Abstract:
Training large-scale machine learning models poses distinct system challenges, given both the size and complexity of today's workloads. Recently, many organizations training state-of-the-art Generative AI models have reported cases of instability during training, often taking the form of loss spikes. Numeric deviation has emerged as a potential cause of this training instability, although quantifying this is especially challenging given the costly nature of training runs. In this work, we develop a principled approach to understanding the effects of numeric deviation, and construct proxies to put observations into context when downstream effects are difficult to quantify. As a case study, we apply this framework to analyze the widely-adopted Flash Attention optimization. We find that Flash Attention sees roughly an order of magnitude more numeric deviation as compared to Baseline Attention at BF16 when measured during an isolated forward pass. We then use a data-driven analysis based on the Wasserstein Distance to provide upper bounds on how this numeric deviation impacts model weights during training, finding that the numerical deviation present in Flash Attention is 2-5 times less significant than low-precision training.
Submitted 4 May, 2024;
originally announced May 2024.
-
Topological Corner Modes by Composite Wannier States in Glide-Symmetric Photonic Crystal
Authors:
Zhenzhen Liu,
Xiaoxi Zhou,
Guochao Wei,
Lei Gao,
Bo hou,
Jun-Jun Xiao
Abstract:
Second-order topological insulators can be characterized by their bulk polarization, which is believed to be intrinsically connected to the center of the Wannier function. In this study, we demonstrate the existence of second-order topological insulators that feature a pair of partially degenerate photonic bands. These arise from the nonsymmorphic glide symmetry in an all-dielectric photonic crystal. The center of the maximally localized Wannier function (MLWF) is consistently located at the origin but is not equivalent with respect to the sum of constituent polarizations. As a result, topological corner modes can be identified by the distinctly hybridized MLWFs that truncate at the sample boundary. Through full-wave numerical simulations paired with microwave experiments, the second-order topology is clearly confirmed and characterized. These topological corner states exhibit notably unique modal symmetries, which are made possible by the inversion of the Wannier bands. Our results provide an alternative approach to explore higher-order topological physics with significant potential for applications in integrated and quantum photonics.
Submitted 3 May, 2024; v1 submitted 1 May, 2024;
originally announced May 2024.
-
MIPI 2024 Challenge on Nighttime Flare Removal: Methods and Results
Authors:
Yuekun Dai,
Dafeng Zhang,
Xiaoming Li,
Zongsheng Yue,
Chongyi Li,
Shangchen Zhou,
Ruicheng Feng,
Peiqing Yang,
Zhezhu Jin,
Guanqun Liu,
Chen Change Loy,
Lize Zhang,
Shuai Liu,
Chaoyu Feng,
Luyang Wang,
Shuan Chen,
Guangqi Shao,
Xiaotao Wang,
Lei Lei,
Qirui Yang,
Qihua Cheng,
Zhiqiang Xu,
Yihao Liu,
Huanjing Yue,
Jingyu Yang
, et al. (38 additional authors not shown)
Abstract:
The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). Building on the achievements of the previous MIPI Workshops held at ECCV 2022 and CVPR 2023, we introduce our third MIPI challenge including three tracks focusing on novel image sensors and imaging algorithms. In this paper, we summarize and review the Nighttime Flare Removal track on MIPI 2024. In total, 170 participants were successfully registered, and 14 teams submitted results in the final testing phase. The developed solutions in this challenge achieved state-of-the-art performance on Nighttime Flare Removal. More details of this challenge and the link to the dataset can be found at https://mipi-challenge.org/MIPI2024/.
Submitted 27 May, 2024; v1 submitted 30 April, 2024;
originally announced April 2024.
-
Persistent interaction topology in data analysis
Authors:
Jian Liu,
Dong Chen,
Guo-Wei Wei
Abstract:
Topological data analysis, as a tool for extracting topological features and characterizing geometric shapes, has experienced significant development across diverse fields. Its key mathematical techniques include persistent homology and the recently developed persistent Laplacians. However, classic mathematical models like simplicial complexes often struggle to provide a localized topological description for interactions or individual elements within a complex system involving a specific set of elements. In this work, we introduce persistent interaction homology and persistent interaction Laplacian that emphasize individual interacting elements in the system. We demonstrate the stability of persistent interaction homology as a persistent module. Furthermore, for a finite discrete set of points in the Euclidean space, we provide the construction of persistent interaction Vietoris-Rips complexes and compute their interaction homology and interaction Laplacians. The proposed methods hold significant promise for analyzing heterogeneously interactive data and emphasizing specific elements in data. Their utility for data science is demonstrated with applications to molecules.
Submitted 17 April, 2024;
originally announced April 2024.
-
Demonstration of DB-GPT: Next Generation Data Interaction System Empowered by Large Language Models
Authors:
Siqiao Xue,
Danrui Qi,
Caigao Jiang,
Wenhui Shi,
Fangyin Cheng,
Keting Chen,
Hongjun Yang,
Zhiping Zhang,
Jianshan He,
Hongyang Zhang,
Ganglin Wei,
Wang Zhao,
Fan Zhou,
Hong Yi,
Shaodong Liu,
Hongjun Yang,
Faqiang Chen
Abstract:
The recent breakthroughs in large language models (LLMs) are positioned to transition many areas of software. The technologies of interacting with data particularly have an important entanglement with LLMs as efficient and intuitive data interactions are paramount. In this paper, we present DB-GPT, a revolutionary and product-ready Python library that integrates LLMs into traditional data interaction tasks to enhance user experience and accessibility. DB-GPT is designed to understand data interaction tasks described in natural language and provide context-aware responses powered by LLMs, making it an indispensable tool for users ranging from novice to expert. Its system design supports deployment across local, distributed, and cloud environments. Beyond handling basic data interaction tasks like Text-to-SQL with LLMs, it can handle complex tasks like generative data analysis through a Multi-Agents framework and the Agentic Workflow Expression Language (AWEL). The Service-oriented Multi-model Management Framework (SMMF) ensures data privacy and security, enabling users to employ DB-GPT with private LLMs. Additionally, DB-GPT offers a series of product-ready features designed to enable users to integrate DB-GPT within their product environments easily. The code of DB-GPT is available on GitHub (https://github.com/eosphoros-ai/DB-GPT), which already has over 10.7k stars. Please install DB-GPT for your own usage with the instructions (https://github.com/eosphoros-ai/DB-GPT#install) and watch a 5-minute introduction video on YouTube (https://youtu.be/n_8RI1ENyl4) to further investigate DB-GPT.
Submitted 24 April, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
-
Avoid Arguments and Escape with Your Self: Expressive Subtyping and Decidable Bidirectional Checking for Reachability Types
Authors:
Songlin Jia,
Guannan Wei,
Siyuan He,
Yuyan Bao,
Tiark Rompf
Abstract:
Despite Rust's success in systems programming, its "shared XOR mutable" principle significantly restricts how mutable values can be used, precluding many useful functional programming idioms. Reachability types are a recent proposal to address the key limitations of Rust-style approaches by tracking, rather than prohibiting, shared, escaping, and mutable data, even in the presence of higher-order functions and polymorphic types. The key to enabling such expressiveness is the notion of self-references in reachability qualifiers. However, self-references present major challenges in designing expressive subtyping and decidable type checking algorithms, since self-references are neither fully covariant nor fully contravariant, yet still need to vary in certain circumstances. This lack of an effective type checking algorithm is a key impediment toward making reachability types truly practical, and leveraging them to bring the benefits of programming with lifetimes and sharing to practical higher-level languages.
In this paper, we investigate the issues of subtyping and type checking of self-references for reachability types. We address key gaps in previous work by proposing a refined notion of subtyping, which more smoothly supports features such as Church-encoded datatypes, making the overall system more expressive. We also develop a sound and decidable bidirectional type checking algorithm, implemented and verified in Coq.
Submitted 15 July, 2024; v1 submitted 11 April, 2024;
originally announced April 2024.
-
Calculation of toroidal Alfvén eigenmode mode structure in general axisymmetric toroidal geometry
Authors:
Guangyu Wei,
Matteo Valerio Falessi,
Tao Wang,
Fulvio Zonca,
Zhiyong Qiu
Abstract:
A workflow is developed based on the ideal MHD model to investigate the linear physics of various Alfvén eigenmodes in general axisymmetric toroidal geometry, by solving the coupled shear Alfvén wave (SAW) and ion sound wave (ISW) equations in ballooning space. The model equations are solved by the FALCON code in the singular layer, and the corresponding solutions are then taken as the boundary conditions for calculating parallel mode structures in the whole ballooning space. As an application of the code, the frequencies and mode structures of toroidal Alfvén eigenmode (TAE) are calculated in the reference equilibria of the Divertor Tokamak Test facility (DTT) with positive and negative triangularities, respectively. By properly handling the boundary conditions, we demonstrate finite TAE damping due to coupling with the local acoustic continuum, and find that the damping rate is small for typical plasma parameters.
Submitted 9 April, 2024;
originally announced April 2024.
-
Drug resistance revealed by in silico deep mutational scanning and mutation tracker
Authors:
Dong Chen,
Gengzhuo Liu,
Hongyan Du,
Junjie Wee,
Rui Wang,
Jiahui Chen,
Jana Shen,
Guo-Wei Wei
Abstract:
As COVID-19 enters its fifth year, it continues to pose a significant global health threat, with the constantly mutating SARS-CoV-2 virus challenging drug effectiveness. A comprehensive understanding of virus-drug interactions is essential for predicting and improving drug effectiveness, especially in combating drug resistance during the pandemic. In response, the Path Laplacian Transformer-based Prospective Analysis Framework (PLFormer-PAF) has been proposed, integrating historical data analysis and predictive modeling strategies. This dual-strategy approach utilizes path topology to transform protein-ligand complexes into topological sequences, enabling the use of advanced large language models for analyzing protein-ligand interactions and enhancing its reliability with factual insights garnered from historical data. It has shown unparalleled performance in predicting binding affinity tasks across various benchmarks, including specific evaluations related to SARS-CoV-2, and assesses the impact of virus mutations on drug efficacy, offering crucial insights into potential drug resistance. The predictions align with observed mutation patterns in SARS-CoV-2, indicating that the widespread use of the Pfizer drug has led to viral evolution and reduced drug efficacy. PLFormer-PAF's capabilities extend beyond identifying drug-resistant strains, positioning it as a key tool in drug discovery research and the development of new therapeutic strategies against fast-mutating viruses like COVID-19.
Submitted 4 March, 2024;
originally announced March 2024.
-
Kaon production in HADES Au+Au collisions at $\sqrt{s_{\rm NN}}=2.4$ GeV
Authors:
Gao-Feng Wei,
Yu-Liang Zhao
Abstract:
Within an isospin- and momentum-dependent transport model that includes kaon production, we study kaon production in heavy-ion collisions (HICs) at SIS (Darmstadt Schwerionen Synchrotron, GSI) energies. Based on simulations of Au + Au collisions at $\sqrt{s_{NN}}=2.4$ GeV with a centrality of 0-40%, a typical reaction that has been carried out by the HADES Collaboration, we find that the medium modification of kaon masses plays a vital role in studying kaon production in HICs, and is also unavoidable for the successful interpretation of the HADES data on kaon rapidity distributions and transverse mass spectra. Moreover, we also find that the kaon transverse and directed flows are affected significantly by both the kaon potential or dispersion relation and the medium modification of kaon masses, and thus could be used in HICs as sensitive probes of the kaon potential or dispersion relation as well as the medium modification of kaon masses.
Submitted 16 April, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
Guac: Energy-Aware and SSA-Based Generation of Coarse-Grained Merged Accelerators from LLVM-IR
Authors:
Iulian Brumar,
Rodrigo Rocha,
Alex Bernat,
Devashree Tripathy,
David Brooks,
Gu-Yeon Wei
Abstract:
Designing accelerators for resource- and power-constrained applications is a daunting task. High-level Synthesis (HLS) addresses these constraints through resource sharing, an optimization at the HLS binding stage that maps multiple operations to the same functional unit.
However, resource sharing is often limited to reusing instructions within a basic block. Instead of searching globally for the best control and dataflow graphs (CDFGs) to combine, it is constrained by existing instruction mappings and schedules.
Coarse-grained function merging (CGFM) at the intermediate representation (IR) level can reuse control and dataflow patterns without dealing with the post-scheduling complexity of mapping operations onto functional units, wires, and registers. The merged functions produced by CGFM can be translated to RTL by HLS, yielding Coarse Grained Merged Accelerators (CGMAs). CGMAs are especially profitable across applications with similar data- and control-flow patterns. Prior work has used CGFM to generate CGMAs without regard for which CGFM algorithms best optimize area, power, and energy costs.
We propose Guac, an energy-aware and SSA-based (static single assignment) CGMA generation methodology. Guac implements a novel ensemble of cost models for efficient CGMA generation. We also show that CGFM algorithms using SSA form to merge control- and dataflow graphs outperform prior non-SSA CGFM designs. We demonstrate significant area, power, and energy savings with respect to the state of the art. In particular, Guac more than doubles energy savings with respect to the closest related work while using a strong resource-sharing baseline.
Submitted 20 February, 2024;
originally announced February 2024.
-
Effects of incompressibility $K_{0}$ in heavy-ion collisions at intermediate energies
Authors:
Xiao-Xiao Long,
Gao-Feng Wei
Abstract:
Within the possible least uncertainty on the nuclear incompressibility $K_{0}$, we examine effects of $K_{0}$ in heavy-ion collisions at intermediate energies. Based on simulations of Au + Au collision at 400 MeV/nucleon using an isospin- and momentum-dependent transport model, we find that the incompressibility $K_{0}$ indeed affects significantly the attainable density in central regions, and thus the particle productions and/or distributions at final states, e.g., nucleon rapidity distributions and yields of charged pions. Nevertheless, through examining the free neutron over proton ratios $n/p$, the neutron-proton differential transverse and directed flows as well as the charged pion ratio $\pi^{-}/\pi^{+}$ and its kinetic energy distribution, we find that these observables are less affected by the uncertainty of $K_{0}$, but mainly sensitive to the slope of symmetry energy at the saturation density. We also compare and discuss our results with the corresponding data.
Submitted 9 May, 2024; v1 submitted 20 February, 2024;
originally announced February 2024.
-
Self-consistent Validation for Machine Learning Electronic Structure
Authors:
Gengyuan Hu,
Gengchen Wei,
Zekun Lou,
Philip H. S. Torr,
Wanli Ouyang,
Han-sen Zhong,
Chen Lin
Abstract:
Machine learning has emerged as a significant approach to efficiently tackle electronic structure problems. Despite its potential, there is little guarantee that a model will generalize to unseen data, which hinders its application in real-world scenarios. To address this issue, a technique has been proposed to estimate the accuracy of the predictions. This method integrates machine learning with self-consistent field methods to achieve both low validation cost and interpretability. This, in turn, enables exploration of the model's ability with active learning and instills confidence in its integration into real-world studies.
Submitted 15 February, 2024;
originally announced February 2024.
-
Position: Topological Deep Learning is the New Frontier for Relational Learning
Authors:
Theodore Papamarkou,
Tolga Birdal,
Michael Bronstein,
Gunnar Carlsson,
Justin Curry,
Yue Gao,
Mustafa Hajij,
Roland Kwitt,
Pietro Liò,
Paolo Di Lorenzo,
Vasileios Maroulas,
Nina Miolane,
Farzana Nasrin,
Karthikeyan Natesan Ramamurthy,
Bastian Rieck,
Simone Scardapane,
Michael T. Schaub,
Petar Veličković,
Bei Wang,
Yusu Wang,
Guo-Wei Wei,
Ghada Zamzmi
Abstract:
Topological deep learning (TDL) is a rapidly evolving field that uses topological features to understand and design deep learning models. This paper posits that TDL is the new frontier for relational learning. TDL may complement graph representation learning and geometric deep learning by incorporating topological concepts, and can thus provide a natural choice for various machine learning settings. To this end, this paper discusses open problems in TDL, ranging from practical benefits to theoretical foundations. For each problem, it outlines potential solutions and future research opportunities. At the same time, this paper serves as an invitation to the scientific community to actively participate in TDL research to unlock the potential of this emerging field.
Submitted 30 May, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Boximator: Generating Rich and Controllable Motions for Video Synthesis
Authors:
Jiawei Wang,
Yuchen Zhang,
Jiaxin Zou,
Yan Zeng,
Guoqiang Wei,
Liping Yuan,
Hang Li
Abstract:
Generating rich and controllable motion is a pivotal challenge in video synthesis. We propose Boximator, a new approach for fine-grained motion control. Boximator introduces two constraint types: hard box and soft box. Users select objects in the conditional frame using hard boxes and then use either type of box to roughly or rigorously define the object's position, shape, or motion path in future frames. Boximator functions as a plug-in for existing video diffusion models. Its training process preserves the base model's knowledge by freezing the original weights and training only the control module. To address training challenges, we introduce a novel self-tracking technique that greatly simplifies the learning of box-object correlations. Empirically, Boximator achieves state-of-the-art video quality (FVD) scores, improving on two base models, and the scores are further enhanced after incorporating box constraints. Its robust motion controllability is validated by drastic increases in the bounding box alignment metric. Human evaluation also shows that users favor Boximator generation results over the base model.
Submitted 2 February, 2024;
originally announced February 2024.
-
Flash: A Hybrid Private Inference Protocol for Deep CNNs with High Accuracy and Low Latency on CPU
Authors:
Hyeri Roh,
Jinsu Yeo,
Yeongil Ko,
Gu-Yeon Wei,
David Brooks,
Woo-Seok Choi
Abstract:
This paper presents Flash, an optimized private inference (PI) hybrid protocol utilizing both homomorphic encryption (HE) and secure two-party computation (2PC), which can reduce the end-to-end PI latency for deep CNN models to less than 1 minute on CPU. To this end, first, Flash proposes a low-latency convolution algorithm built upon a fast slot rotation operation and a novel data encoding scheme, which results in 4-94x performance gain over the state-of-the-art. Second, to minimize the communication cost introduced by the standard nonlinear activation function ReLU, Flash replaces all ReLUs with the polynomial $x^2+x$ and trains deep CNN models with the new activation function. The trained models improve the inference accuracy for CIFAR-10/100 and TinyImageNet by 16% on average (up to 40% for ResNet-32) compared to prior art. Last, Flash proposes an efficient 2PC-based $x^2+x$ evaluation protocol that does not require any offline communication and that reduces the total communication cost to process the activation layer by 84-196x over the state-of-the-art. As a result, the end-to-end PI latency of Flash implemented on CPU is 0.02 minutes for CIFAR-100 and 0.57 minutes for TinyImageNet classification, while the total data communication is 0.07GB for CIFAR-100 and 0.22GB for TinyImageNet. Flash improves the state-of-the-art PI by 16-45x in latency and 84-196x in communication cost. Moreover, even for ImageNet, Flash can deliver a latency of less than 1 minute on CPU with a total communication of less than 1GB.
Submitted 29 January, 2024;
originally announced January 2024.
-
Complete space-like self-expanders in the Minkovski space
Authors:
Zhi Li,
Guoxin Wei
Abstract:
It is our purpose to study complete space-like self-expanders in the Minkowski space. Using a maximum principle of Omori-Yau type, we obtain rigidity theorems on $n$-dimensional complete space-like self-expanders in the Minkowski space $\mathbb R^{n+1}_{1}$. For complete space-like self-expanders of dimension $2$, we give a classification of them under the assumption of constant squared norm of the second fundamental form.
Submitted 29 December, 2023;
originally announced January 2024.
-
DB-GPT: Empowering Database Interactions with Private Large Language Models
Authors:
Siqiao Xue,
Caigao Jiang,
Wenhui Shi,
Fangyin Cheng,
Keting Chen,
Hongjun Yang,
Zhiping Zhang,
Jianshan He,
Hongyang Zhang,
Ganglin Wei,
Wang Zhao,
Fan Zhou,
Danrui Qi,
Hong Yi,
Shaodong Liu,
Faqiang Chen
Abstract:
The recent breakthroughs in large language models (LLMs) are positioned to transition many areas of software. Database technologies particularly have an important entanglement with LLMs as efficient and intuitive database interactions are paramount. In this paper, we present DB-GPT, a revolutionary and production-ready project that integrates LLMs with traditional database systems to enhance user experience and accessibility. DB-GPT is designed to understand natural language queries, provide context-aware responses, and generate complex SQL queries with high accuracy, making it an indispensable tool for users ranging from novice to expert. The core innovation in DB-GPT lies in its private LLM technology, which is fine-tuned on domain-specific corpora to maintain user privacy and ensure data security while offering the benefits of state-of-the-art LLMs. We detail the architecture of DB-GPT, which includes a novel retrieval augmented generation (RAG) knowledge system, an adaptive learning mechanism to continuously improve performance based on user feedback and a service-oriented multi-model framework (SMMF) with powerful data-driven agents. Our extensive experiments and user studies confirm that DB-GPT represents a paradigm shift in database interactions, offering a more natural, efficient, and secure way to engage with data repositories. The paper concludes with a discussion of the implications of DB-GPT framework on the future of human-database interaction and outlines potential avenues for further enhancements and applications in the field. The project code is available at https://github.com/eosphoros-ai/DB-GPT. Experience DB-GPT for yourself by installing it with the instructions https://github.com/eosphoros-ai/DB-GPT#install and view a concise 10-minute video at https://www.youtube.com/watch?v=KYs4nTDzEhk.
Submitted 3 January, 2024; v1 submitted 28 December, 2023;
originally announced December 2023.
-
High-Fidelity Diffusion-based Image Editing
Authors:
Chen Hou,
Guoqiang Wei,
Zhibo Chen
Abstract:
Diffusion models have attained remarkable success in the domains of image generation and editing. It is widely recognized that employing more inversion and denoising steps in diffusion models leads to improved image reconstruction quality. However, the editing performance of diffusion models tends to be no more satisfactory even with more denoising steps. The deficiency in editing could be attributed to the conditional Markovian property of the editing process, where errors accumulate throughout denoising steps. To tackle this challenge, we first propose an innovative framework where a rectifier module is incorporated to modulate diffusion model weights with residual features, thereby providing compensatory information to bridge the fidelity gap. Furthermore, we introduce a novel learning paradigm aimed at minimizing error propagation during the editing process, which trains the editing procedure in a manner similar to denoising score-matching. Extensive experiments demonstrate that our proposed framework and training strategy achieve high-fidelity reconstruction and editing results across various levels of denoising steps, while exhibiting exceptional performance in terms of both quantitative metrics and qualitative assessments. Moreover, we explore our model's generalization through several applications like image-to-image translation and out-of-domain image editing.
Submitted 4 January, 2024; v1 submitted 25 December, 2023;
originally announced December 2023.
-
README: Bridging Medical Jargon and Lay Understanding for Patient Education through Data-Centric NLP
Authors:
Zonghai Yao,
Nandyala Siddharth Kantu,
Guanghao Wei,
Hieu Tran,
Zhangqi Duan,
Sunjae Kwon,
Zhichao Yang,
README annotation team,
Hong Yu
Abstract:
The advancement in healthcare has shifted focus toward patient-centric approaches, particularly in self-care and patient education, facilitated by access to Electronic Health Records (EHR). However, medical jargon in EHRs poses significant challenges in patient comprehension. To address this, we introduce a new task of automatically generating lay definitions, aiming to simplify complex medical terms into patient-friendly lay language. We first created the README dataset, an extensive collection of over 50,000 unique (medical term, lay definition) pairs and 300,000 mentions, each offering context-aware lay definitions manually annotated by domain experts. We have also engineered a data-centric Human-AI pipeline that synergizes data filtering, augmentation, and selection to improve data quality. We then used README as the training data for models and leveraged a Retrieval-Augmented Generation method to reduce hallucinations and improve the quality of model outputs. Our extensive automatic and human evaluations demonstrate that open-source mobile-friendly models, when fine-tuned with high-quality data, are capable of matching or even surpassing the performance of state-of-the-art closed-source large language models like ChatGPT. This research represents a significant stride in closing the knowledge gap in patient education and advancing patient-centric healthcare solutions.
Submitted 16 June, 2024; v1 submitted 24 December, 2023;
originally announced December 2023.
-
Generative AI Beyond LLMs: System Implications of Multi-Modal Generation
Authors:
Alicia Golden,
Samuel Hsia,
Fei Sun,
Bilge Acun,
Basil Hosmer,
Yejin Lee,
Zachary DeVito,
Jeff Johnson,
Gu-Yeon Wei,
David Brooks,
Carole-Jean Wu
Abstract:
As the development of large-scale Generative AI models evolves beyond text (1D) generation to include image (2D) and video (3D) generation, processing spatial and temporal information presents unique challenges to quality, performance, and efficiency. We present the first work towards understanding this new system design space for multi-modal text-to-image (TTI) and text-to-video (TTV) generation models. Current model architecture designs are bifurcated into 2 categories: Diffusion- and Transformer-based models. Our systematic performance characterization on a suite of eight representative TTI/TTV models shows that after state-of-the-art optimization techniques such as Flash Attention are applied, Convolution accounts for up to 44% of execution time for Diffusion-based TTI models, while Linear layers consume up to 49% of execution time for Transformer-based models. We additionally observe that Diffusion-based TTI models resemble the Prefill stage of LLM inference, and benefit from 1.1-2.5x greater speedup from Flash Attention than Transformer-based TTI models that resemble the Decode phase. Since optimizations designed for LLMs do not map directly onto TTI/TTV models, we must conduct a thorough characterization of these workloads to gain insights for new optimization opportunities. In doing so, we define sequence length in the context of TTI/TTV models and observe sequence length can vary up to 4x in Diffusion model inference. We additionally observe temporal aspects of TTV workloads pose unique system bottlenecks, with Temporal Attention accounting for over 60% of total Attention time. Overall, our in-depth system performance characterization is a critical first step towards designing efficient and deployable systems for emerging TTI/TTV workloads.
Submitted 5 May, 2024; v1 submitted 21 December, 2023;
originally announced December 2023.
-
Poisson-Boltzmann based machine learning (PBML) model for electrostatic analysis
Authors:
Jiahui Chen,
Yongjia Xu,
Xin Yang,
Zixuan Cang,
Weihua Geng,
Guo-Wei Wei
Abstract:
Electrostatics is of paramount importance to chemistry, physics, biology, and medicine. The Poisson-Boltzmann (PB) theory is a primary model for electrostatic analysis. However, it is highly challenging to compute accurate PB electrostatic solvation free energies for macromolecules due to the nonlinearity, dielectric jumps, charge singularity, and geometric complexity associated with the PB equation. The present work introduces a PB based machine learning (PBML) model for biomolecular electrostatic analysis. Trained with the second-order accurate MIBPB solver, the proposed PBML model is found to be more accurate and faster than several eminent PB solvers in electrostatic analysis. The proposed PBML model can provide highly accurate PB electrostatic solvation free energy of new biomolecules or new conformations generated by molecular dynamics with much reduced computational cost.
Submitted 29 November, 2023;
originally announced December 2023.
-
Multiscale differential geometry learning of networks with applications to single-cell RNA sequencing data
Authors:
Hongsong Feng,
Sean Cottrell,
Yuta Hozumi,
Guo-Wei Wei
Abstract:
Single-cell RNA sequencing (scRNA-seq) has emerged as a transformative technology, offering unparalleled insights into the intricate landscape of cellular diversity and gene expression dynamics. The analysis of scRNA-seq data poses challenges attributed to both sparsity and the extensive number of genes implicated. An increasing number of computational tools are devised for analyzing and interpreting scRNA-seq data. We present a multiscale differential geometry (MDG) strategy to exploit the geometric and biological properties inherent in scRNA-seq data. We assume that those intrinsic properties of cells lie on a family of low-dimensional manifolds embedded in the high-dimensional space of scRNA-seq data. Subsequently, we explore these properties via multiscale cell-cell interactive manifolds. Our multiscale curvature-based representation serves as a powerful approach to effectively encapsulate the complex relationships in the cell-cell network. We showcase the utility of our novel approach by demonstrating its effectiveness in classifying cell types. This innovative application of differential geometry in scRNA-seq analysis opens new avenues for understanding the intricacies of biological networks and holds great potential for network analysis in other fields.
Submitted 15 December, 2023;
originally announced December 2023.
-
Persistent Topological Laplacians -- a Survey
Authors:
Xiaoqi Wei,
Guo-Wei Wei
Abstract:
Persistent topological Laplacians constitute a new class of tools in topological data analysis (TDA), motivated by the necessity to address challenges encountered in persistent homology when handling complex data. These Laplacians combine multiscale analysis with topological techniques to characterize the topological and geometrical features of functions and data. Their kernels fully retrieve the topological invariants of persistent homology, while their nonharmonic spectra provide supplementary information, such as the homotopic shape evolution of data. Persistent topological Laplacians have demonstrated superior performance over persistent homology in addressing large-scale protein engineering datasets. In this survey, we offer a pedagogical review of persistent topological Laplacians formulated on various mathematical objects, including simplicial complexes, path complexes, flag complexes, digraphs, hypergraphs, hyperdigraphs, cellular sheaves, as well as $N$-chain complexes. Alongside fundamental mathematical concepts, we emphasize the theoretical formulations associated with various persistent topological Laplacians and illustrate their applications through numerous simple geometric shapes.
Submitted 9 December, 2023;
originally announced December 2023.
-
Persistent Directed Flag Laplacian
Authors:
Benjamin Jones,
Guowei Wei
Abstract:
Topological data analysis (TDA) has had enormous success in science and engineering in the past decade. Persistent topological Laplacians (PTLs) overcome some limitations of persistent homology, a key technique in TDA, and provide substantial insight into the behavior of various geometric and topological objects. This work extends PTLs to directed flag complexes, which are an exciting generalization of flag complexes, also known as clique complexes, that arise naturally in many situations. We introduce the directed flag Laplacian and show that the proposed persistent directed flag Laplacian (PDFL) is a distinct way of analyzing these flag complexes. Example calculations are provided to demonstrate the potential of the proposed PDFL in real-world applications.
Submitted 4 December, 2023;
originally announced December 2023.
-
Multiscale Topology in Interactomic Network: From Transcriptome to Antiaddiction Drug Repurposing
Authors:
Hongyan Du,
Guo-Wei Wei,
Tingjun Hou
Abstract:
The escalating drug addiction crisis in the United States underscores the urgent need for innovative therapeutic strategies. This study embarked on an innovative and rigorous strategy to unearth potential drug repurposing candidates for opioid and cocaine addiction treatment, bridging the gap between transcriptomic data analysis and drug discovery. We initiated our approach by conducting differential gene expression analysis on addiction-related transcriptomic data to identify key genes. We propose a novel topological differentiation to identify key genes from a protein-protein interaction (PPI) network derived from DEGs. This method utilizes persistent Laplacians to accurately single out pivotal nodes within the network, conducting this analysis in a multiscale manner to ensure high reliability. Through rigorous literature validation, pathway analysis, and data-availability scrutiny, we identified three pivotal molecular targets, mTOR, mGluR5, and NMDAR, for drug repurposing from DrugBank. We crafted machine learning models employing two natural language processing (NLP)-based embeddings and a traditional 2D fingerprint, which demonstrated robust predictive ability in gauging binding affinities of DrugBank compounds to selected targets. Furthermore, we elucidated the interactions of promising drugs with the targets and evaluated their drug-likeness. This study delineates a multi-faceted and comprehensive analytical framework, amalgamating bioinformatics, topological data analysis and machine learning, for drug repurposing in addiction treatment, setting the stage for subsequent experimental validation. The versatility of the methods we developed allows for applications across a range of diseases and transcriptomic datasets.
Submitted 2 December, 2023;
originally announced December 2023.
-
Persistent Mayer homology and persistent Mayer Laplacian
Authors:
Li Shen,
Jian Liu,
Guo-Wei Wei
Abstract:
In algebraic topology, the differential (i.e., boundary operator) typically satisfies $d^{2}=0$. However, the generalized differential $d^{N}=0$ for an integer $N\geq 2$ has been studied in terms of Mayer homology on $N$-chain complexes for more than eighty years. We introduce Mayer Laplacians on $N$-chain complexes. We show that both Mayer homology and Mayer Laplacians offer considerable application potential, providing topological and geometric insights into spaces. We also introduce persistent Mayer homology and persistent Mayer Laplacians at various $N$. The Wasserstein distance and stability of persistence diagrams associated with Mayer homology are investigated. Our computational experiments indicate that the topological features offered by persistent Mayer homology and the spectra given by persistent Mayer Laplacians hold substantial promise for large, complex, and diverse data. We envision that the present work serves as an inaugural step towards integrating Mayer homology and Mayer Laplacians into the realm of topological data analysis.
Submitted 2 December, 2023;
originally announced December 2023.
-
Mechanical control of quantum transport in graphene
Authors:
A. C. McRae,
G. Wei,
L. Huang,
S. Yigen,
V. Tayari,
A. R. Champagne
Abstract:
Two-dimensional materials (2DMs) are fundamentally electro-mechanical systems. Their environment unavoidably strains them and modifies their quantum transport properties. For instance, a simple uniaxial strain could completely turn off the conductivity of ballistic graphene or switch on/off the superconducting phase of magic-angle bilayer graphene. Here we report measurements of quantum transport in strained graphene which agree quantitatively with models based on mechanically-induced gauge potentials. We mechanically induce in-situ a scalar potential, which modifies graphene's work function by up to 25 meV, and vector potentials which suppress the ballistic conductivity of graphene by up to 30% and control its quantum interferences. To do so, we developed an experimental platform able to precisely tune both the mechanics and electrostatics of suspended graphene transistors at low temperature over a broad range of strain (up to 2.6%). This work opens many opportunities to experimentally explore quantitative strain effects in 2DM quantum transport and technologies.
Submitted 30 November, 2023;
originally announced December 2023.
-
Interaction homotopy and interaction homology
Authors:
Jian Liu,
Dong Chen,
Guo-Wei Wei
Abstract:
Interactions in complex systems are widely observed across various fields, drawing increased attention from researchers. In mathematics, efforts are made to develop various theories and methods for studying the interactions between spaces. In this work, we present an algebraic topology framework to explore interactions between spaces. We introduce the concept of interaction spaces and investigate their homotopy, singular homology, and simplicial homology. Furthermore, we demonstrate that interaction singular homology serves as an invariant under interaction homotopy. We believe that the proposed framework holds potential for practical applications.
Submitted 27 November, 2023;
originally announced November 2023.
-
Persistent Dirac of Path and Hypergraph
Authors:
Faisal Suwayyid,
Guo-Wei Wei
Abstract:
This work introduces the development of path Dirac and hypergraph Dirac operators, along with an exploration of their persistence. These operators excel in distinguishing between harmonic and non-harmonic spectra, offering valuable insights into the subcomplexes within these structures. The paper showcases the functionality of these operators through a series of examples in various contexts. An important facet of this research involves examining the operators' sensitivity to filtration, emphasizing their capacity to adapt to topological changes. The paper also explores a significant application of persistent path Dirac and persistent hypergraph Dirac in the field of molecular science, specifically in the analysis of molecular structures. The study introduces strict preorders derived from molecular structures, which generate graphs and digraphs with intricate path structures. The depth of information within these path complexes reflects the complexity of different preorder classes influenced by molecular structures. This characteristic underscores the effectiveness of these tools in the realm of topological data analysis.
Submitted 3 December, 2023; v1 submitted 24 November, 2023;
originally announced November 2023.
-
Hardware Resilience Properties of Text-Guided Image Classifiers
Authors:
Syed Talal Wasim,
Kabila Haile Soboka,
Abdulrahman Mahmoud,
Salman Khan,
David Brooks,
Gu-Yeon Wei
Abstract:
This paper presents a novel method to enhance the reliability of image classification models during deployment in the face of transient hardware errors. By utilizing enriched text embeddings derived from GPT-3 with question prompts per class and CLIP pretrained text encoder, we investigate their impact as an initialization for the classification layer. Our approach achieves a remarkable $5.5\times$ average increase in hardware reliability (and up to $14\times$) across various architectures in the most critical layer, with minimal accuracy drop ($0.3\%$ on average) compared to baseline PyTorch models. Furthermore, our method seamlessly integrates with any image classification backbone, showcases results across various network architectures, decreases parameter and FLOPs overhead, and follows a consistent training recipe. This research offers a practical and efficient solution to bolster the robustness of image classification models against hardware failures, with potential implications for future studies in this domain. Our code and models are released at https://github.com/TalalWasim/TextGuidedResilience.
Submitted 5 December, 2023; v1 submitted 23 November, 2023;
originally announced November 2023.
-
Knot data analysis using multiscale Gauss link integral
Authors:
Li Shen,
Hongsong Feng,
Fengling Li,
Fengchun Lei,
Jie Wu,
Guo-Wei Wei
Abstract:
In the past decade, topological data analysis (TDA) has emerged as a powerful approach in data science. The main technique in TDA is persistent homology, which tracks topological invariants over the filtration of point cloud data using algebraic topology. Although knot theory and related subjects are a focus of study in mathematics, their success in practical applications is quite limited due to the lack of localization and quantization. We address these challenges by introducing knot data analysis (KDA), a new paradigm that incorporates curve segmentation and multiscale analysis into the Gauss link integral. The resulting multiscale Gauss link integral (mGLI) recovers the global topological properties of knots and links at an appropriate scale but offers multiscale feature vectors to capture the local structures and connectivities of each curve segment at various scales. The proposed mGLI significantly outperforms other state-of-the-art methods in benchmark protein flexibility analysis, including earlier persistent homology-based methods. Our approach enables the integration of artificial intelligence (AI) and KDA for general curve-like objects and data.
Submitted 2 October, 2023;
originally announced November 2023.
-
Colding-Minicozzi entropies of some self-shrinkers
Authors:
Qilun Luo,
Guoxin Wei,
Fu-An Zhang
Abstract:
In this note, we numerically estimate the Colding-Minicozzi entropies of some self-shrinkers and find that the Colding-Minicozzi entropies of the $n$-dimensional Angenent tori are decreasing in the dimension $n$ ($2\leq n\leq 5\times 10^{7}$), which partially answers the questions of Berchenko-Kogan \cite{BK}.
Submitted 21 November, 2023;
originally announced November 2023.
-
Make Pixels Dance: High-Dynamic Video Generation
Authors:
Yan Zeng,
Guoqiang Wei,
Jiani Zheng,
Jiaxin Zou,
Yang Wei,
Yuchen Zhang,
Hang Li
Abstract:
Creating high-dynamic videos such as motion-rich actions and sophisticated visual effects poses a significant challenge in the field of artificial intelligence. Unfortunately, current state-of-the-art video generation methods, primarily focusing on text-to-video generation, tend to produce video clips with minimal motions despite maintaining high fidelity. We argue that relying solely on text instructions is insufficient and suboptimal for video generation. In this paper, we introduce PixelDance, a novel approach based on diffusion models that incorporates image instructions for both the first and last frames in conjunction with text instructions for video generation. Comprehensive experimental results demonstrate that PixelDance trained with public data exhibits significantly better proficiency in synthesizing videos with complex scenes and intricate motions, setting a new standard for video generation.
Submitted 18 November, 2023;
originally announced November 2023.
-
Compliant actuators that mimic biological muscle performance with applications in a highly biomimetic robotic arm
Authors:
Haosen Yang,
Guowu Wei,
Lei Ren,
Lingyun Yan
Abstract:
This paper endeavours to bridge the existing gap in muscular actuator design for ligament-skeletal-inspired robots, thereby fostering the evolution of these robotic systems. We introduce two novel compliant actuators, namely the Internal Torsion Spring Compliant Actuator (ICA) and the External Spring Compliant Actuator (ECA), and present a comparative analysis against the previously conceived Magnet Integrated Soft Actuator (MISA) through computational and experimental results. These actuators, employing a motor-tendon system, emulate biological muscle-like forms, enhancing artificial muscle technology. A robotic arm application inspired by the skeletal ligament system is presented. Experiments demonstrate satisfactory power in tasks like lifting dumbbells (peak power: 36 W), playing table tennis (end-effector speed: 3.2 m/s), and door opening, without compromising biomimetic aesthetics. Compared to other linear stiffness serial elastic actuators (SEAs), ECA and ICA exhibit high power-to-volume ($361\times 10^{3}$ W/m$^{3}$) and power-to-mass (111.6 W/kg) ratios respectively, endorsing the biomimetic design's promise in robotic development.
Submitted 31 October, 2023;
originally announced October 2023.
-
Rethinking Semi-Supervised Imbalanced Node Classification from Bias-Variance Decomposition
Authors:
Divin Yan,
Gengchen Wei,
Chen Yang,
Shengzhong Zhang,
Zengfeng Huang
Abstract:
This paper introduces a new approach to address the issue of class imbalance in graph neural networks (GNNs) for learning on graph-structured data. Our approach integrates imbalanced node classification and Bias-Variance Decomposition, establishing a theoretical framework that closely relates data imbalance to model variance. We also leverage a graph augmentation technique to estimate the variance, and design a regularization term to alleviate the impact of imbalance. Exhaustive tests are conducted on multiple benchmarks, including naturally imbalanced datasets and public-split class-imbalanced datasets, demonstrating that our approach outperforms state-of-the-art methods in various imbalanced scenarios. This work provides a novel theoretical perspective for addressing the problem of imbalanced node classification in GNNs.
Submitted 5 February, 2024; v1 submitted 28 October, 2023;
originally announced October 2023.