-
Leveraging Multi-facet Paths for Heterogeneous Graph Representation Learning
Authors:
JongWoo Kim,
SeongYeub Chu,
HyeongMin Park,
Bryan Wong,
MunYong Yi
Abstract:
Recent advancements in graph neural networks (GNNs) and heterogeneous GNNs (HGNNs) have advanced node embeddings and relationship learning for various tasks. However, existing methods often rely on domain-specific predefined meta-paths, which are coarse-grained and focus solely on aspects like node type, limiting their ability to capture complex interactions. We introduce MF2Vec, a model that uses multi-faceted (fine-grained) paths instead of predefined meta-paths. MF2Vec extracts paths via random walks and generates multi-faceted vectors, ignoring predefined schemas. This method learns diverse aspects of nodes and their relationships, constructs a homogeneous network, and creates node embeddings for classification, link prediction, and clustering. Extensive experiments show that MF2Vec outperforms existing methods, offering a more flexible and comprehensive framework for analyzing complex networks. The code is available at https://anonymous.4open.science/r/MF2Vec-6ABC.
Submitted 30 July, 2024;
originally announced July 2024.
-
Synthetic Patients: Simulating Difficult Conversations with Multimodal Generative AI for Medical Education
Authors:
Simon N. Chu,
Alex J. Goodell
Abstract:
Problem: Effective patient-centered communication is a core competency for physicians. However, both seasoned providers and medical trainees report decreased confidence in leading conversations on sensitive topics such as goals of care or end-of-life discussions. The significant administrative burden and the resources required to provide dedicated training in leading difficult conversations have been a long-standing problem in medical education.
Approach: In this work, we present a novel educational tool designed to facilitate interactive, real-time simulations of difficult conversations in a video-based format through the use of multimodal generative artificial intelligence (AI). Leveraging recent advances in language modeling, computer vision, and generative audio, this tool creates realistic, interactive scenarios with avatars, or "synthetic patients." These synthetic patients interact with users throughout various stages of medical care using a custom-built video chat application, offering learners the chance to practice conversations with patients from diverse belief systems, personalities, and ethnic backgrounds.
Outcomes: While the development of this platform demanded substantial upfront investment in labor, it offers a highly realistic simulation experience with minimal financial investment. For medical trainees, this educational tool can be implemented within programs to simulate patient-provider conversations and can be incorporated into existing palliative care curricula to provide a scalable, high-fidelity simulation environment for mastering difficult conversations.
Next Steps: Future developments will explore enhancing the authenticity of these encounters by working with patients to incorporate their histories and personalities, as well as using AI-generated evaluations to offer immediate, constructive feedback to learners post-simulation.
Submitted 30 May, 2024;
originally announced May 2024.
-
Memory-Maze: Scenario Driven Benchmark and Visual Language Navigation Model for Guiding Blind People
Authors:
Masaki Kuribayashi,
Kohei Uehara,
Allan Wang,
Daisuke Sato,
Simon Chu,
Shigeo Morishima
Abstract:
Visual Language Navigation (VLN) powered navigation robots have the potential to guide blind people by understanding and executing route instructions provided by sighted passersby. This capability allows robots to operate in environments that are often unknown a priori. Existing VLN models are insufficient for the scenario of navigation guidance for blind people, as they need to understand routes described from human memory, which frequently contain stutters, errors, and omissions of detail, as opposed to those obtained by thinking out loud, such as in the Room-to-Room dataset. However, there is currently no benchmark that simulates instructions obtained from human memory in environments where blind people navigate. To this end, we present our benchmark, Memory-Maze, which simulates the scenario of seeking route instructions for guiding blind people. Our benchmark contains a maze-like structured virtual environment and novel route instruction data from human memory. To collect natural language instructions, we conducted two studies, one with sighted passersby onsite and one with annotators online. Our analysis demonstrates that the instruction data collected onsite were longer and contained more varied wording. Alongside our benchmark, we propose a VLN model better equipped to handle the scenario. Our proposed VLN model uses Large Language Models (LLMs) to parse instructions and generate Python code for robot control. We further show that the existing state-of-the-art model performed suboptimally on our benchmark. In contrast, our proposed method outperformed the state-of-the-art model by a fair margin. We found that future research should exercise caution when considering VLN technology for practical applications, as real-world scenarios have different characteristics than ones collected in traditional settings.
Submitted 11 May, 2024;
originally announced May 2024.
-
HMANet: Hybrid Multi-Axis Aggregation Network for Image Super-Resolution
Authors:
Shu-Chuan Chu,
Zhi-Chao Dou,
Jeng-Shyang Pan,
Shaowei Weng,
Junbao Li
Abstract:
Transformer-based methods have demonstrated excellent performance on super-resolution visual tasks, surpassing conventional convolutional neural networks. However, existing work typically restricts self-attention computation to non-overlapping windows to save computational costs. This means that Transformer-based networks can only use input information from a limited spatial range. Therefore, a novel Hybrid Multi-Axis Aggregation network (HMA) is proposed in this paper to better exploit the potential information in features. HMA is constructed by stacking Residual Hybrid Transformer Blocks (RHTB) and Grid Attention Blocks (GAB). On the one hand, RHTB combines channel attention and self-attention to enhance non-local feature fusion and produce more attractive visual results. On the other hand, GAB is used in cross-domain information interaction to jointly model similar features and obtain a larger perceptual field. For the super-resolution task, a novel pre-training method is designed for the training phase to further enhance the model's representation capabilities, and the proposed model's effectiveness is validated through many experiments. The experimental results show that HMA outperforms the state-of-the-art methods on the benchmark dataset. We provide code and models at https://github.com/korouuuuu/HMA.
Submitted 8 May, 2024;
originally announced May 2024.
-
R2D2 image reconstruction with model uncertainty quantification in radio astronomy
Authors:
Amir Aghabiglou,
Chung San Chu,
Arwa Dabbech,
Yves Wiaux
Abstract:
The "Residual-to-Residual DNN series for high-Dynamic range imaging" (R2D2) approach was recently introduced for Radio-Interferometric (RI) imaging in astronomy. R2D2's reconstruction is formed as a series of residual images, iteratively estimated as outputs of Deep Neural Networks (DNNs) taking the previous iteration's image estimate and associated data residual as inputs. In this work, we investigate the robustness of the R2D2 image estimation process, by studying the uncertainty associated with its series of learned models. Adopting an ensemble averaging approach, multiple series can be trained, arising from different random DNN initializations of the training process at each iteration. The resulting multiple R2D2 instances can also be leveraged to generate "R2D2 samples", from which empirical mean and standard deviation endow the algorithm with a joint estimation and uncertainty quantification functionality. Focusing on RI imaging, and adopting a telescope-specific approach, multiple R2D2 instances were trained to encompass the most general observation setting of the Very Large Array (VLA). Simulations and real-data experiments confirm that: (i) R2D2's image estimation capability is superior to that of the state-of-the-art algorithms; (ii) its ultra-fast reconstruction capability (arising from series with only a few DNNs) makes the computation of multiple reconstruction samples and of uncertainty maps practical even at large image dimension; (iii) it is characterized by a very low model uncertainty.
Submitted 27 May, 2024; v1 submitted 26 March, 2024;
originally announced March 2024.
-
Scalable Non-Cartesian Magnetic Resonance Imaging with R2D2
Authors:
Yiwei Chen,
Chao Tang,
Amir Aghabiglou,
Chung San Chu,
Yves Wiaux
Abstract:
We propose a new approach for non-Cartesian magnetic resonance image reconstruction. While unrolled architectures provide robustness via data-consistency layers, embedding measurement operators in Deep Neural Networks (DNNs) can become impractical at large scale. Alternative Plug-and-Play (PnP) approaches, where the denoising DNNs are blind to the measurement setting, are not affected by this limitation and have also proven effective, but their highly iterative nature also affects scalability. To address this scalability challenge, we leverage the "Residual-to-Residual DNN series for high-Dynamic range imaging" (R2D2) approach recently introduced in astronomical imaging. R2D2's reconstruction is formed as a series of residual images, iteratively estimated as outputs of DNNs taking the previous iteration's image estimate and associated data residual as inputs. The method can be interpreted as a learned version of the Matching Pursuit algorithm. We demonstrate R2D2 in simulation, considering radial k-space sampling acquisition sequences. Our preliminary results suggest that R2D2 achieves: (i) suboptimal performance compared to its unrolled incarnation R2D2-Net, which is however non-scalable due to the necessary embedding of NUFFT-based data-consistency layers; (ii) superior reconstruction quality to a scalable version of R2D2-Net embedding an FFT-based approximation for data consistency; (iii) superior reconstruction quality to PnP, while requiring only a few iterations.
Submitted 28 May, 2024; v1 submitted 26 March, 2024;
originally announced March 2024.
-
Aligning Large Language Models for Enhancing Psychiatric Interviews through Symptom Delineation and Summarization
Authors:
Jae-hee So,
Joonhwan Chang,
Eunji Kim,
Junho Na,
JiYeon Choi,
Jy-yong Sohn,
Byung-Hoon Kim,
Sang Hui Chu
Abstract:
Recent advancements in Large Language Models (LLMs) have accelerated their usage in various domains. Given that psychiatric interviews are goal-oriented and structured dialogues between the professional interviewer and the interviewee, they are one of the most underexplored areas where LLMs can contribute substantial value. Here, we explore the use of LLMs for enhancing psychiatric interviews by analyzing counseling data from North Korean defectors with traumatic events and mental health issues. Specifically, we investigate whether LLMs can (1) delineate the part of the conversation that suggests psychiatric symptoms and name the symptoms, and (2) summarize stressors and symptoms, based on the interview dialogue transcript. The transcript data was labeled by mental health experts for training and evaluation of LLMs. Our experimental results show that appropriately prompted LLMs can achieve high performance on both the symptom delineation task and the summarization task. This research contributes to the nascent field of applying LLMs to psychiatric interviews and demonstrates their potential effectiveness in aiding mental health practitioners.
Submitted 26 March, 2024;
originally announced March 2024.
-
The R2D2 deep neural network series paradigm for fast precision imaging in radio astronomy
Authors:
Amir Aghabiglou,
Chung San Chu,
Arwa Dabbech,
Yves Wiaux
Abstract:
Radio-interferometric (RI) imaging entails solving high-resolution high-dynamic range inverse problems from large data volumes. Recent image reconstruction techniques grounded in optimization theory have demonstrated remarkable capability for imaging precision, well beyond CLEAN's capability. These range from advanced proximal algorithms propelled by handcrafted regularization operators, such as the SARA family, to hybrid plug-and-play (PnP) algorithms propelled by learned regularization denoisers, such as AIRI. Optimization and PnP structures are however highly iterative, which hinders their ability to handle the extreme data sizes expected from future instruments. To address this scalability challenge, we introduce a novel deep learning approach, dubbed "Residual-to-Residual DNN series for high-Dynamic range imaging". R2D2's reconstruction is formed as a series of residual images, iteratively estimated as outputs of Deep Neural Networks (DNNs) taking the previous iteration's image estimate and associated data residual as inputs. It thus takes a hybrid structure between a PnP algorithm and a learned version of the matching pursuit algorithm that underpins CLEAN. We present a comprehensive study of our approach, featuring its multiple incarnations distinguished by their DNN architectures. We provide a detailed description of its training process, targeting a telescope-specific approach. R2D2's capability to deliver high precision is demonstrated in simulation, across a variety of image and observation settings using the Very Large Array (VLA). Its reconstruction speed is also demonstrated: with only few iterations required to clean data residuals at dynamic ranges up to 100000, R2D2 opens the door to fast precision imaging. R2D2 codes are available in the BASPLib library on GitHub.
Submitted 1 May, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
CloudTracks: A Dataset for Localizing Ship Tracks in Satellite Images of Clouds
Authors:
Muhammad Ahmed Chaudhry,
Lyna Kim,
Jeremy Irvin,
Yuzu Ido,
Sonia Chu,
Jared Thomas Isobe,
Andrew Y. Ng,
Duncan Watson-Parris
Abstract:
Clouds play a significant role in global temperature regulation through their effect on planetary albedo. Anthropogenic emissions of aerosols can alter the albedo of clouds, but the extent of this effect, and its consequent impact on temperature change, remains uncertain. Human-induced clouds caused by ship aerosol emissions, commonly referred to as ship tracks, provide visible manifestations of this effect distinct from adjacent cloud regions and therefore serve as a useful sandbox to study human-induced clouds. However, the lack of large-scale ship track data makes it difficult to deduce their general effects on cloud formation. Towards developing automated approaches to localize ship tracks at scale, we present CloudTracks, a dataset containing 3,560 satellite images labeled with more than 12,000 ship track instance annotations. We train semantic segmentation and instance segmentation model baselines on our dataset and find that our best model substantially outperforms previous state-of-the-art for ship track localization (61.29 vs. 48.65 IoU). We also find that the best instance segmentation model is able to identify the number of ship tracks in each image more accurately than the previous state-of-the-art (1.64 vs. 4.99 MAE). However, we identify cases where the best model struggles to accurately localize and count ship tracks, so we believe CloudTracks will stimulate novel machine learning approaches to better detect elongated and overlapping features in satellite images. We release our dataset openly at zenodo.org/records/10042922.
Submitted 25 January, 2024;
originally announced January 2024.
-
Generative Design of Crystal Structures by Point Cloud Representations and Diffusion Model
Authors:
Zhelin Li,
Rami Mrad,
Runxian Jiao,
Guan Huang,
Jun Shan,
Shibing Chu,
Yuanping Chen
Abstract:
Efficiently generating energetically stable crystal structures has long been a challenge in material design, primarily due to the immense number of possible arrangements of atoms in a crystal lattice. To facilitate the discovery of stable materials, we present a framework for the generation of synthesizable materials, leveraging a point cloud representation to encode intricate structural information. At the heart of this framework is a diffusion model. To gauge the efficacy of our approach, we employ it to reconstruct input structures from our training datasets, rigorously validating its high reconstruction performance. Furthermore, we demonstrate the potential of Point Cloud-Based Crystal Diffusion (PCCD) by generating entirely new materials, emphasizing their synthesizability. Our research contributes to the advancement of materials design and synthesis through generative design instead of the conventional substitution or experience-based discovery.
Submitted 30 January, 2024; v1 submitted 23 January, 2024;
originally announced January 2024.
-
Integrating Graceful Degradation and Recovery through Requirement-driven Adaptation
Authors:
Simon Chu,
Justin Koe,
David Garlan,
Eunsuk Kang
Abstract:
Cyber-physical systems (CPS) are subject to environmental uncertainties such as adverse operating conditions, malicious attacks, and hardware degradation. These uncertainties may lead to failures that put the system in a sub-optimal or unsafe state. Systems that are resilient to such uncertainties rely on two types of operations: (1) graceful degradation, to ensure that the system maintains an acceptable level of safety during unexpected environmental conditions and (2) recovery, to facilitate the resumption of normal system functions. Typically, mechanisms for degradation and recovery are developed independently from each other, and later integrated into a system, requiring the designer to develop an additional, ad-hoc logic for activating and coordinating between the two operations. In this paper, we propose a self-adaptation approach for improving system resiliency through automated triggering and coordination of graceful degradation and recovery. The key idea behind our approach is to treat degradation and recovery as requirement-driven adaptation tasks: Degradation can be thought of as temporarily weakening original (i.e., ideal) system requirements to be achieved by the system, and recovery as strengthening the weakened requirements when the environment returns within an expected operating boundary. Furthermore, by treating weakening and strengthening as dual operations, we argue that a single requirement-based adaptation method is sufficient to enable coordination between degradation and recovery. Given system requirements specified in signal temporal logic (STL), we propose a run-time adaptation framework that performs degradation and recovery in response to environmental changes. We describe a prototype implementation of our framework and demonstrate the feasibility of the proposed approach using a case study in unmanned underwater vehicles.
Submitted 8 April, 2024; v1 submitted 17 January, 2024;
originally announced January 2024.
-
OkayPlan: Obstacle Kinematics Augmented Dynamic Real-time Path Planning via Particle Swarm Optimization
Authors:
Jinghao Xin,
Jinwoo Kim,
Shengjia Chu,
Ning Li
Abstract:
Existing Global Path Planning (GPP) algorithms predominantly presume planning in static environments. This assumption immensely limits their applications to Unmanned Surface Vehicles (USVs) that typically navigate in dynamic environments. To address this limitation, we present OkayPlan, a GPP algorithm capable of generating safe and short paths in dynamic scenarios at a real-time executing speed (125 Hz on a desktop-class computer). Specifically, we approach the challenge of dynamic obstacle avoidance by formulating the path planning problem as an Obstacle Kinematics Augmented Optimization Problem (OKAOP), which can be efficiently resolved through a PSO-based optimizer at a real-time speed. Meanwhile, a Dynamic Prioritized Initialization (DPI) mechanism that adaptively initializes potential solutions for the optimization problem is established to further ameliorate the solution quality. Additionally, a relaxation strategy that facilitates the autonomous tuning of OkayPlan's hyperparameters in dynamic environments is devised. Comprehensive experiments, including comparative evaluations, ablation studies, and applications to 3D physical simulation platforms, have been conducted to substantiate the efficacy of our approach. Results indicate that OkayPlan outstrips existing methods in terms of path safety, length optimality, and computational efficiency, establishing it as a potent GPP technique for dynamic environments. The video and code associated with this paper are accessible at https://github.com/XinJingHao/OkayPlan.
Submitted 11 April, 2024; v1 submitted 10 January, 2024;
originally announced January 2024.
-
Coordinate-based Neural Network for Fourier Phase Retrieval
Authors:
Tingyou Li,
Zixin Xu,
Yong S. Chu,
Xiaojing Huang,
Jizhou Li
Abstract:
Fourier phase retrieval is essential for high-definition imaging of nanoscale structures across diverse fields, notably coherent diffraction imaging. This study presents the Single impliCit neurAl Network (SCAN), a tool built upon coordinate neural networks meticulously designed for enhanced phase retrieval performance. Remedying the drawbacks of conventional iterative methods, which are easily trapped in local minima and sensitive to noise, SCAN adeptly connects object coordinates to their amplitude and phase within a unified network in an unsupervised manner. While many existing methods primarily use Fourier magnitude in their loss function, our approach incorporates both the predicted magnitude and phase, enhancing retrieval accuracy. Comprehensive tests validate SCAN's superiority over traditional and other deep learning models regarding accuracy and noise robustness. We also demonstrate that SCAN excels in the ptychography setting.
Submitted 8 January, 2024; v1 submitted 24 November, 2023;
originally announced November 2023.
-
"Do it my way!": Impact of Customizations on Trust perceptions in Human-Robot Collaboration
Authors:
Parv Kapoor,
Simon Chu,
Angela Chen
Abstract:
Trust has been shown to be a key factor in effective human-robot collaboration. In the context of assistive robotics, the effect of trust factors on human experience is further pronounced. Personalization of assistive robots is an orthogonal factor positively correlated with robot adoption and user perceptions. In this work, we investigate the relationship between these factors through a within-subjects study (N=17). We provide different levels of customization possibilities over baseline autonomous robot behavior and investigate their impact on trust. Our findings indicate that increased levels of customization were associated with higher trust and comfort perceptions. The assistive robot design process can benefit significantly from our insights for designing trustworthy and customized robots.
Submitted 28 October, 2023;
originally announced October 2023.
-
Runtime Resolution of Feature Interactions through Adaptive Requirement Weakening
Authors:
Simon Chu,
Emma Shedden,
Changjian Zhang,
Rômulo Meira-Góes,
Gabriel A. Moreno,
David Garlan,
Eunsuk Kang
Abstract:
The feature interaction problem occurs when two or more independently developed components interact with each other in unanticipated ways, resulting in undesirable system behaviors. Feature interaction problems remain a challenge for emerging domains in cyber-physical systems (CPS), such as the Internet of Things and autonomous drones. Existing techniques for resolving feature interactions take a "winner-takes-all" approach, where one out of the conflicting features is selected as the most desirable one, and the rest are disabled. However, when multiple of the conflicting features fulfill important system requirements, being forced to select one of them can result in an undesirable system outcome. In this paper, we propose a new resolution approach that allows all of the conflicting features to continue to partially fulfill their requirements during the resolution process. In particular, our approach leverages the idea of adaptive requirement weakening, which involves one or more features temporarily weakening their level of performance in order to co-exist with the other features in a consistent manner. Given feature requirements specified in Signal Temporal Logic (STL), we propose an automated method and a runtime architecture for automatically weakening the requirements to resolve a conflict. We demonstrate our approach through case studies on feature interactions in autonomous drones.
Submitted 27 October, 2023;
originally announced October 2023.
-
CLEANing Cygnus A deep and fast with R2D2
Authors:
Arwa Dabbech,
Amir Aghabiglou,
Chung San Chu,
Yves Wiaux
Abstract:
A novel deep learning paradigm for synthesis imaging by radio interferometry in astronomy was recently proposed, dubbed "Residual-to-Residual DNN series for high-Dynamic range imaging" (R2D2). In this work, we start by shedding light on R2D2's algorithmic structure, interpreting it as a learned version of CLEAN with minor cycles substituted with a deep neural network (DNN) whose training is iteration-specific. We then proceed with R2D2's first demonstration on real data, for monochromatic intensity imaging of the radio galaxy Cygnus A from S band observations with the Very Large Array (VLA). We show that the modeling power of R2D2's learning approach enables delivering high-precision imaging, superseding the resolution of CLEAN, and matching the precision of modern optimization and plug-and-play algorithms, respectively uSARA and AIRI. Requiring few major-cycle iterations only, R2D2 provides a much faster reconstruction than uSARA and AIRI, known to be highly iterative, and is at least as fast as CLEAN.
Submitted 23 April, 2024; v1 submitted 6 September, 2023;
originally announced September 2023.
-
SpecTracle: Wearable Facial Motion Tracking from Unobtrusive Peripheral Cameras
Authors:
Yinan Xuan,
Varun Viswanath,
Sunny Chu,
Owen Bartolf,
Jessica Echterhoff,
Edward Wang
Abstract:
Facial motion tracking in head-mounted displays (HMD) has the potential to enable immersive "face-to-face" interaction in a virtual environment. However, current works on facial tracking are not suitable for unobtrusive augmented reality (AR) glasses or do not have the ability to track arbitrary facial movements. In this work, we demonstrate a novel system called SpecTracle that tracks a user's facial motions using two wide-angle cameras mounted right next to the visor of a Hololens. Avoiding the usage of cameras extended in front of the face, our system greatly improves the feasibility to integrate full-face tracking into a low-profile form factor. We also demonstrate that a neural network-based model processing the wide-angle cameras can run in real-time at 24 frames per second (fps) on a mobile GPU and track independent facial movement for different parts of the face with a user-independent model. Using a short personalized calibration, the system improves its tracking performance by 42.3% compared to the user-independent model.
Submitted 14 August, 2023;
originally announced August 2023.
-
Proportionally Representative Clustering
Authors:
Haris Aziz,
Barton E. Lee,
Sean Morota Chu,
Jeremy Vollen
Abstract:
In recent years, there has been a surge in effort to formalize notions of fairness in machine learning. We focus on clustering, one of the fundamental tasks in unsupervised machine learning. We propose a new axiom "proportional representation fairness" (PRF) that is designed for clustering problems where the selection of centroids reflects the distribution of data points and how tightly they are clustered together. Our fairness concept is not satisfied by existing fair clustering algorithms. We design efficient algorithms to achieve PRF both for unconstrained and discrete clustering problems. Our algorithm for the unconstrained setting is also the first known polynomial-time approximation algorithm for the well-studied Proportional Fairness (PF) axiom (Chen, Fain, Lyu, and Munagala, ICML, 2019). Our algorithm for the discrete setting also matches the best known approximation factor for PF.
Submitted 15 August, 2023; v1 submitted 26 April, 2023;
originally announced April 2023.
-
Design of Two-Level Incentive Mechanisms for Hierarchical Federated Learning
Authors:
Shunfeng Chu,
Jun Li,
Kang Wei,
Yuwen Qian,
Kunlun Wang,
Feng Shu,
Wen Chen
Abstract:
Hierarchical Federated Learning (HFL) is a distributed machine learning paradigm tailored for multi-tiered computation architectures, which supports massive access of devices' models simultaneously. To enable efficient HFL, it is crucial to design suitable incentive mechanisms to ensure that devices actively participate in local training. However, there are few studies on incentive mechanism design for HFL. In this paper, we design two-level incentive mechanisms for HFL with a two-tiered computing structure to encourage the participation of entities in each tier in the HFL training. In the lower-level game, we propose a coalition formation game to jointly optimize the edge association and bandwidth allocation problem, and obtain efficient coalition partitions by the proposed preference rule, which can be proven to be stable by an exact potential game. In the upper-level game, we design the Stackelberg game algorithm, which not only determines the optimal number of edge aggregations for edge servers to maximize their utility, but also optimizes the unit reward provided for the edge aggregation performance to ensure the interests of cloud servers. Furthermore, numerical results indicate that the proposed algorithms can achieve better performance than the benchmark schemes.
Submitted 16 January, 2024; v1 submitted 9 April, 2023;
originally announced April 2023.
-
Matching Algorithms under Diversity-Based Reservations
Authors:
Haris Aziz,
Sean Morota Chu,
Zhaohong Sun
Abstract:
Selection under category or diversity constraints is a ubiquitous and widely-applicable problem that is encountered in immigration, school choice, hiring, and healthcare rationing. These diversity constraints are typically represented by minimum and maximum quotas on various categories or types. We undertake a detailed comparative study of applicant selection algorithms with respect to the diversity goals.
Submitted 18 February, 2023;
originally announced February 2023.
-
Self-supervised Learning for Segmentation and Quantification of Dopamine Neurons in Parkinson's Disease
Authors:
Fatemeh Haghighi,
Soumitra Ghosh,
Hai Ngu,
Sarah Chu,
Han Lin,
Mohsen Hejrati,
Baris Bingol,
Somaye Hashemifar
Abstract:
Parkinson's Disease (PD) is the second most common neurodegenerative disease in humans. PD is characterized by the gradual loss of dopaminergic neurons in the Substantia Nigra (SN). Counting the number of dopaminergic neurons in the SN is one of the most important indexes in evaluating drug efficacy in PD animal models. Currently, analyzing and quantifying dopaminergic neurons is conducted manually by experts through analysis of digital pathology images which is laborious, time-consuming, and highly subjective. As such, a reliable and unbiased automated system is demanded for the quantification of dopaminergic neurons in digital pathology images. Recent years have seen a surge in adopting deep learning solutions in medical image processing. However, developing high-performing deep learning models hinges on the availability of large-scale, high-quality annotated data, which can be expensive to acquire, especially in applications like digital pathology image analysis. To this end, we propose an end-to-end deep learning framework based on self-supervised learning for the segmentation and quantification of dopaminergic neurons in PD animal models. To the best of our knowledge, this is the first deep learning model that detects the cell body of dopaminergic neurons, counts the number of dopaminergic neurons, and provides characteristics of individual dopaminergic neurons as a numerical output. Extensive experiments demonstrate the effectiveness of our model in quantifying neurons with high precision, which can provide a faster turnaround for drug efficacy studies, better understanding of dopaminergic neuronal health status, and unbiased results in PD pre-clinical research. As part of our contributions, we also provide the first publicly available dataset of histology digital images along with expert annotations for the segmentation of TH-positive DA neuronal soma.
Submitted 12 October, 2023; v1 submitted 11 January, 2023;
originally announced January 2023.
-
Multiclass Semantic Segmentation to Identify Anatomical Sub-Regions of Brain and Measure Neuronal Health in Parkinson's Disease
Authors:
Hosein Barzekar,
Hai Ngu,
Han Hui Lin,
Mohsen Hejrati,
Steven Ray Valdespino,
Sarah Chu,
Baris Bingol,
Somaye Hashemifar,
Soumitra Ghosh
Abstract:
Automated segmentation of anatomical sub-regions with high precision has become a necessity to enable the quantification and characterization of cells/tissues in histology images. Currently, no machine learning model is available to analyze anatomical sub-regions of the brain in 2D histological images. Scientists rely on manually segmenting anatomical sub-regions of the brain, which is extremely time-consuming and prone to labeler-dependent bias. One of the major challenges in accomplishing such a task is the lack of high-quality annotated images that can be used to train a generic artificial intelligence model. In this study, we employed a UNet-based architecture and compared model performance with various combinations of encoders, image sizes, and sample selection techniques. Additionally, to increase the sample set we resorted to data augmentation, which provided data diversity and robust learning. We trained our best-fit model on approximately one thousand annotated 2D brain images stained with Nissl/Haematoxylin and Tyrosine Hydroxylase enzyme (TH, indicator of dopaminergic neuron viability). The dataset comprises different animal studies, enabling the model to be trained on different datasets. The model is able to effectively detect two sub-regions, compacta (SNCD) and reticulata (SNr), in all the images. In spite of limited training data, our best model achieves a mean intersection over union (IOU) of 79% and a mean dice coefficient of 87%. In conclusion, the UNet-based model with EfficientNet as an encoder outperforms all other encoders, resulting in a first-of-its-kind robust model for multiclass segmentation of sub-brain regions in 2D images.
Submitted 7 January, 2023;
originally announced January 2023.
-
Generative Antibody Design for Complementary Chain Pairing Sequences through Encoder-Decoder Language Model
Authors:
Simon K. S. Chu,
Kathy Y. Wei
Abstract:
Current protein language models (pLMs) predominantly focus on single-chain protein sequences and often have not accounted for constraints on generative design imposed by protein-protein interactions. To address this gap, we present paired Antibody T5 (pAbT5), an encoder-decoder model to generate complementary heavy or light chain from its pairing partner. We show that our model respects conservation in framework regions and variability in hypervariable domains, demonstrated by agreement with sequence alignment and variable-length CDR loops. We also show that our model captures chain pairing preferences through the recovery of ground-truth chain type and gene families. Our results showcase the potential of pAbT5 in generative antibody design, incorporating biological constraints from chain pairing preferences.
Submitted 20 November, 2023; v1 submitted 6 January, 2023;
originally announced January 2023.
-
Adore: Differentially Oblivious Relational Database Operators
Authors:
Lianke Qin,
Rajesh Jayaram,
Elaine Shi,
Zhao Song,
Danyang Zhuo,
Shumo Chu
Abstract:
There has been a recent effort in applying differential privacy on memory access patterns to enhance data privacy. This is called differential obliviousness. Differential obliviousness is a promising direction because it provides a principled trade-off between performance and desired level of privacy. To date, it is still an open question whether differential obliviousness can speed up database processing with respect to full obliviousness. In this paper, we present the design and implementation of three new major database operators: selection with projection, grouping with aggregation, and foreign key join. We prove that they satisfy the notion of differential obliviousness. Our differentially oblivious operators have reduced cache complexity, runtime complexity, and output size compared to their state-of-the-art fully oblivious counterparts. We also demonstrate that our implementation of these differentially oblivious operators can outperform their state-of-the-art fully oblivious counterparts by up to $7.4\times$.
Submitted 29 September, 2023; v1 submitted 9 December, 2022;
originally announced December 2022.
-
Self-Supervised Intensity-Event Stereo Matching
Authors:
Jinjin Gu,
Jinan Zhou,
Ringo Sai Wo Chu,
Yan Chen,
Jiawei Zhang,
Xuanye Cheng,
Song Zhang,
Jimmy S. Ren
Abstract:
Event cameras are novel bio-inspired vision sensors that output pixel-level intensity changes with microsecond accuracy, a high dynamic range, and low power consumption. Despite these advantages, event cameras cannot be directly applied to computational imaging tasks due to the inability to obtain high-quality intensity and events simultaneously. This paper aims to connect a standalone event camera and a modern intensity camera so that applications can take advantage of both sensors. We establish this connection through a multi-modal stereo matching task. We first convert events to a reconstructed image and extend the existing stereo networks to this multi-modality condition. We propose a self-supervised method to train the multi-modal stereo network without using ground truth disparity data. The structure loss calculated on image gradients is used to enable self-supervised learning on such multi-modal data. Exploiting the internal stereo constraint between views with different modalities, we introduce general stereo loss functions, including disparity cross-consistency loss and internal disparity loss, leading to improved performance and robustness compared to existing approaches. The experiments demonstrate the effectiveness of the proposed method, especially the proposed general stereo loss functions, on both synthetic and real datasets. Finally, we shed light on employing the aligned events and intensity images in downstream tasks, e.g., video interpolation.
Submitted 1 November, 2022;
originally announced November 2022.
-
Pooling Revisited: Your Receptive Field is Suboptimal
Authors:
Dong-Hwan Jang,
Sanghyeok Chu,
Joonhyuk Kim,
Bohyung Han
Abstract:
The size and shape of the receptive field determine how the network aggregates local information and affect the overall performance of a model considerably. Many components in a neural network, such as kernel sizes and strides for convolution and pooling operations, influence the configuration of a receptive field. However, they still rely on hyperparameters, and the receptive fields of existing models result in suboptimal shapes and sizes. Hence, we propose a simple yet effective Dynamically Optimized Pooling operation, referred to as DynOPool, which optimizes the scale factors of feature maps end-to-end by learning the desirable size and shape of its receptive field in each layer. Any kind of resizing module in a deep neural network can be replaced by operations with DynOPool at a minimal cost. Also, DynOPool controls the complexity of a model by introducing an additional loss term that constrains computational cost. Our experiments show that the models equipped with the proposed learnable resizing module outperform the baseline networks on multiple datasets in image classification and semantic segmentation.
Submitted 29 June, 2022; v1 submitted 30 May, 2022;
originally announced May 2022.
-
Research on Wearable Technologies for Learning: A Systematic Review
Authors:
Sharon Lynn Chu,
Brittany M. Garcia,
Neha Rani
Abstract:
A good amount of research has explored the use of wearables for educational or learning purposes. We have now reached a point where much literature can be found on that topic, but few attempts have been made to make sense of that literature from a holistic perspective. This paper presents a systematic review of the literature on wearables for learning. Literature was sourced from conferences and journals pertaining to technology and education, and through an ad hoc search. Our review focuses on identifying the ways that wearables have been used to support learning and provides perspectives on that issue from a historical dimension, and with regard to the types of wearables used, the populations targeted, and the settings addressed. Seven different ways in which wearables have been used to support learning were identified. We propose a framework identifying five main components that have been addressed in existing research on how wearables can support learning and present our interpretations of unaddressed research directions based on our review results.
Submitted 27 January, 2022;
originally announced January 2022.
-
Robust topology optimization of structures under uncertain propagation of imprecise stochastic-based uncertain field
Authors:
Kang Gao,
Duy Minh Doc,
Sheng Chu,
Gang Wu,
H. Alicia Kim,
Carol A. Featherston
Abstract:
This study introduces a novel computational framework for Robust Topology Optimization (RTO) considering imprecise random field parameters. Unlike the worst-case approach, the present method provides upper and lower bounds for the mean and standard deviation of compliance as well as the optimized topological layouts of a structure for various scenarios. In the proposed approach, the imprecise random field variables are determined utilizing parameterized p-boxes with different confidence intervals. The Karhunen-Loève (K-L) expansion is extended to provide a spectral description of the imprecise random field. The linear superposition method in conjunction with a linear combination of orthogonal functions is employed to obtain explicit mathematical expressions for the first and second order statistical moments of the structural compliance. Then, an interval sensitivity analysis is carried out, applying the Orthogonal Similarity Transformation (OST) method, with the boundaries of each intermediate variable searched efficiently at every iteration using a Combinatorial Approach (CA). Finally, the validity, accuracy, and applicability of the work are rigorously checked by comparing the outputs of the proposed approach with those obtained using the particle swarm optimization (PSO) and Quasi-Monte-Carlo Simulation (QMCS) methods. Three different numerical examples with imprecise random field loads are presented to show the effectiveness and feasibility of the study.
Submitted 27 January, 2022;
originally announced January 2022.
-
Learning Debiased and Disentangled Representations for Semantic Segmentation
Authors:
Sanghyeok Chu,
Dongwan Kim,
Bohyung Han
Abstract:
Deep neural networks are susceptible to learning biased models with entangled feature representations, which may lead to subpar performance on various downstream tasks. This is particularly true for under-represented classes, where a lack of diversity in the data exacerbates the tendency. This limitation has been addressed mostly in classification tasks, but there is little study on additional challenges that may appear in more complex dense prediction problems including semantic segmentation. To this end, we propose a model-agnostic and stochastic training scheme for semantic segmentation, which facilitates the learning of debiased and disentangled representations. For each class, we first extract class-specific information from the highly entangled feature map. Then, information related to a randomly sampled class is suppressed by a feature selection process in the feature space. By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes, and the model is able to learn more debiased and disentangled feature representations. Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks, with especially notable performance gains on under-represented classes.
Submitted 31 October, 2021;
originally announced November 2021.
-
Federated Learning Over Wireless Channels: Dynamic Resource Allocation and Task Scheduling
Authors:
Shunfeng Chu,
Jun Li,
Jianxin Wang,
Zhe Wang,
Ming Ding,
Yijin Zang,
Yuwen Qian,
Wen Chen
Abstract:
With the development of federated learning (FL), mobile devices (MDs) are able to train their local models with private data and send them to a central server for aggregation, thereby preventing sensitive raw data leakage. In this paper, we aim to improve the training performance of FL systems in the context of wireless channels and stochastic energy arrivals of MDs. To this end, we dynamically optimize MDs' transmission power and training task scheduling. We first model this dynamic programming problem as a constrained Markov decision process (CMDP). Due to the high dimensionality of our CMDP problem, we propose online stochastic learning methods to simplify the CMDP and design online algorithms to obtain an efficient policy for all MDs. Since there are long-term constraints in our CMDP, we utilize a Lagrange multiplier approach to tackle this issue. Furthermore, we prove the convergence of the proposed online stochastic learning algorithm. Numerical results indicate that the proposed algorithms can achieve better performance than the benchmark algorithms.
Submitted 13 June, 2021;
originally announced June 2021.
-
NeuSE: A Neural Snapshot Ensemble Method for Collaborative Filtering
Authors:
Dongsheng Li,
Haodong Liu,
Chao Chen,
Yingying Zhao,
Stephen M. Chu,
Bo Yang
Abstract:
In collaborative filtering (CF) algorithms, the optimal models are usually learned by globally minimizing the empirical risks averaged over all the observed data. However, the global models are often obtained via a performance tradeoff among users/items, i.e., not all users/items are perfectly fitted by the global models due to the hard non-convex optimization problems in CF algorithms. Ensemble learning can address this issue by learning multiple diverse models but usually suffers from efficiency issues on large datasets or complex algorithms. In this paper, we keep the intermediate models obtained during global model learning as the snapshot models, and then adaptively combine the snapshot models for individual user-item pairs using a memory network-based method. Empirical studies on three real-world datasets show that the proposed method can extensively and significantly improve the accuracy (up to 15.9% relatively) when applied to a variety of existing collaborative filtering methods.
Submitted 15 April, 2021;
originally announced April 2021.
-
11 TeraFLOPs per second photonic convolutional accelerator for deep learning optical neural networks
Authors:
Xingyuan Xu,
Mengxi Tan,
Bill Corcoran,
Jiayang Wu,
Andreas Boes,
Thach G. Nguyen,
Sai T. Chu,
Brent E. Little,
Damien G. Hicks,
Roberto Morandotti,
Arnan Mitchell,
David J. Moss
Abstract:
Convolutional neural networks (CNNs), inspired by biological visual cortex systems, are a powerful category of artificial neural networks that can extract the hierarchical features of raw data to greatly reduce the network parametric complexity and enhance the predicting accuracy. They are of significant interest for machine learning tasks such as computer vision, speech recognition, playing board games and medical diagnosis. Optical neural networks offer the promise of dramatically accelerating computing speed to overcome the inherent bandwidth bottleneck of electronics. Here, we demonstrate a universal optical vector convolutional accelerator operating beyond 10 TeraFLOPS (floating point operations per second), generating convolutions of images of 250,000 pixels with 8 bit resolution for 10 kernels simultaneously, enough for facial image recognition. We then use the same hardware to sequentially form a deep optical CNN with ten output neurons, achieving successful recognition of full 10 digits with 900 pixel handwritten digit images with 88% accuracy. Our results are based on simultaneously interleaving temporal, wavelength and spatial dimensions enabled by an integrated microcomb source. This approach is scalable and trainable to much more complex networks for demanding applications such as unmanned vehicle and real-time video recognition.
Submitted 14 November, 2020;
originally announced November 2020.
-
Google Crowdsourced Speech Corpora and Related Open-Source Resources for Low-Resource Languages and Dialects: An Overview
Authors:
Alena Butryna,
Shan-Hui Cathy Chu,
Isin Demirsahin,
Alexander Gutkin,
Linne Ha,
Fei He,
Martin Jansche,
Cibu Johny,
Anna Katanova,
Oddur Kjartansson,
Chenfang Li,
Tatiana Merkulova,
Yin May Oo,
Knot Pipatsrisawat,
Clara Rivera,
Supheakmungkol Sarin,
Pasindu de Silva,
Keshan Sodimana,
Richard Sproat,
Theeraphol Wattanavekin,
Jaka Aris Eko Wibawa
Abstract:
This paper presents an overview of a program designed to address the growing need for developing freely available speech resources for under-represented languages. At present we have released 38 datasets for building text-to-speech and automatic speech recognition applications for languages and dialects of South and Southeast Asia, Africa, Europe and South America. The paper describes the methodology used for developing such corpora and presents some of our findings that could benefit under-represented language communities.
Submitted 13 October, 2020;
originally announced October 2020.
-
Learn by Observation: Imitation Learning for Drone Patrolling from Videos of A Human Navigator
Authors:
Yue Fan,
Shilei Chu,
Wei Zhang,
Ran Song,
Yibin Li
Abstract:
We present an imitation learning method for autonomous drone patrolling based only on raw videos. Different from previous methods, we propose to let the drone learn patrolling in the air by observing and imitating how a human navigator does it on the ground. The observation process enables the automatic collection and annotation of data using inter-frame geometric consistency, resulting in less manual effort and high accuracy. Then a newly designed neural network is trained based on the annotated data to predict appropriate directions and translations for the drone to patrol in a lane-keeping manner as humans do. Our method allows the drone to fly at a high altitude with a broad view and low risk. It can also detect all accessible directions at crossroads and further carry out the integration of available user instructions and autonomous patrolling control commands. Extensive experiments are conducted to demonstrate the accuracy of the proposed imitation learning process as well as the reliability of the holistic system for autonomous drone navigation. The code, datasets, and video demonstrations are available at https://vsislab.github.io/uavpatrol
Submitted 30 August, 2020;
originally announced August 2020.
-
Implementing a Fast Unbounded Quantum Fanout Gate Using Power-Law Interactions
Authors:
Andrew Y. Guo,
Abhinav Deshpande,
Su-Kuan Chu,
Zachary Eldredge,
Przemyslaw Bienias,
Dhruv Devulapalli,
Yuan Su,
Andrew M. Childs,
Alexey V. Gorshkov
Abstract:
The standard circuit model for quantum computation presumes the ability to directly perform gates between arbitrary pairs of qubits, which is unlikely to be practical for large-scale experiments. Power-law interactions with strength decaying as $1/r^α$ in the distance $r$ provide an experimentally realizable resource for information processing, whilst still retaining long-range connectivity. We leverage the power of these interactions to implement a fast quantum fanout gate with an arbitrary number of targets. Our implementation allows the quantum Fourier transform (QFT) and Shor's algorithm to be performed on a $D$-dimensional lattice in time logarithmic in the number of qubits for interactions with $α\le D$. As a corollary, we show that power-law systems with $α\le D$ are difficult to simulate classically even for short times, under a standard assumption that factoring is classically intractable. Complementarily, we develop a new technique to give a general lower bound, linear in the size of the system, on the time required to implement the QFT and the fanout gate in systems that are constrained by a linear light cone. This allows us to prove an asymptotically tighter lower bound for long-range systems than is possible with previously available techniques.
Submitted 1 July, 2020;
originally announced July 2020.
-
MVIN: Learning Multiview Items for Recommendation
Authors:
Chang-You Tai,
Meng-Ru Wu,
Yun-Wei Chu,
Shao-Yu Chu,
Lun-Wei Ku
Abstract:
Researchers have begun to utilize heterogeneous knowledge graphs (KGs) as auxiliary information in recommendation systems (RS) to mitigate the cold start and sparsity issues. However, using a graph neural network (GNN) to capture information in a KG and then apply it in RS is still problematic, as it is unable to see each item's properties from multiple perspectives. To address these issues, we propose the multi-view item network (MVIN), a GNN-based recommendation model that provides superior recommendations by describing items from a unique mixed view from user and entity angles. MVIN learns item representations from both the user view and the entity view. From the user view, user-oriented modules score and aggregate features to make recommendations from a personalized perspective constructed according to KG entities, which incorporates user click information. From the entity view, the mixing layer contrasts layer-wise GCN information to further obtain comprehensive features from internal entity-entity interactions in the KG. We evaluate MVIN on three real-world datasets: MovieLens-1M (ML-1M), LFM-1b 2015 (LFM-1b), and Amazon-Book (AZ-book). Results show that MVIN significantly outperforms state-of-the-art methods on these three datasets. In addition, from user-view cases, we find that MVIN indeed captures entities that attract users. Figures further illustrate that mixing layers in a heterogeneous KG play a vital role in neighborhood information aggregation.
Submitted 26 May, 2020;
originally announced May 2020.
-
Single photonic perceptron based on a soliton crystal Kerr microcomb for high-speed, scalable, optical neural networks
Authors:
Xingyuan Xu,
Mengxi Tan,
Bill Corcoran,
Jiayang Wu,
Thach G. Nguyen,
Andreas Boes,
Sai T. Chu,
Brent E. Little,
Roberto Morandotti,
Arnan Mitchell,
Damien G. Hicks,
David J. Moss
Abstract:
Optical artificial neural networks (ONNs), analog computing hardware tailored for machine learning, have significant potential for ultra-high computing speed and energy efficiency. We propose a new approach to architectures for ONNs based on integrated Kerr micro-comb sources that is programmable, highly scalable and capable of reaching ultra-high speeds. We experimentally demonstrate the building block of the ONN, a single-neuron perceptron, by mapping synapses onto 49 wavelengths of a micro-comb to achieve a high single-unit throughput of 11.9 Giga-FLOPS at 8 bits per FLOP, corresponding to 95.2 Gbps. We test the perceptron on simple standard benchmark datasets, handwritten-digit recognition and cancer-cell detection, achieving over 90% and 85% accuracy, respectively. This performance is a direct result of the record-small wavelength spacing (49 GHz) for a coherent integrated microcomb source, which results in an unprecedented number of wavelengths for neuromorphic optics. Finally, we propose an approach to scaling the perceptron to a deep learning network using the same single micro-comb device and standard off-the-shelf telecommunications technology, for high-throughput operation involving full matrix multiplication for applications such as real-time massive data processing for unmanned vehicle and aircraft tracking.
Submitted 3 March, 2020;
originally announced March 2020.
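As a rough software analogy for the single-neuron perceptron described in the abstract above, the sketch below computes a weighted sum over 49 inputs (one per comb wavelength) and reproduces the reported throughput arithmetic (11.9 Giga-FLOPS at 8 bits per FLOP giving 95.2 Gbps). The function names, the sigmoid activation, and the random weights and inputs are assumptions for illustration; the abstract does not specify them, and nothing here models the optical hardware.

```python
import math
import random

NUM_WAVELENGTHS = 49  # number of comb lines used as synapses, per the abstract


def perceptron(inputs, weights, bias=0.0):
    """Single neuron: weighted sum followed by a sigmoid.

    The sigmoid is an assumed activation chosen for illustration only.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def throughput_gbps(giga_flops, bits_per_flop):
    """Convert an operation rate in Giga-FLOPS to a bit rate in Gbps."""
    return giga_flops * bits_per_flop


rng = random.Random(0)  # fixed seed so the sketch is reproducible
weights = [rng.uniform(-1.0, 1.0) for _ in range(NUM_WAVELENGTHS)]  # hypothetical
inputs = [rng.uniform(0.0, 1.0) for _ in range(NUM_WAVELENGTHS)]    # hypothetical

output = perceptron(inputs, weights)
bit_rate = throughput_gbps(11.9, 8)
print(f"neuron output: {output:.4f}, bit rate: {bit_rate:.1f} Gbps")
```

Each weight plays the role of one synapse, so the number of usable wavelengths directly sets the width of the weighted sum.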
-
Learning Numeral Embeddings
Authors:
Chengyue Jiang,
Zhonglin Nian,
Kaihao Guo,
Shanbo Chu,
Yinggong Zhao,
Libin Shen,
Kewei Tu
Abstract:
Word embedding is an essential building block for deep learning methods for natural language processing. Although word embedding has been extensively studied over the years, the problem of how to effectively embed numerals, a special subset of words, is still underexplored. Existing word embedding methods do not learn numeral embeddings well because there are an infinite number of numerals and their individual appearances in training corpora are scarce. In this paper, we propose two novel numeral embedding methods that can handle the out-of-vocabulary (OOV) problem for numerals. We first induce a finite set of prototype numerals using either a self-organizing map or a Gaussian mixture model. We then represent the embedding of a numeral as a weighted average of the prototype number embeddings. Numeral embeddings represented in this manner can be plugged into existing word embedding learning approaches such as skip-gram for training. We evaluated our methods and showed their effectiveness on four intrinsic and extrinsic tasks: word similarity, embedding numeracy, numeral prediction, and sequence labeling.
Submitted 11 January, 2020; v1 submitted 27 December, 2019;
originally announced January 2020.
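The numeral-embedding abstract above represents a numeral as a weighted average of prototype embeddings. A minimal sketch of that idea follows. The weighting scheme (a softmax over negative absolute distance to each prototype), the `temperature` parameter, and the toy prototypes and vectors are all assumptions made for illustration; the paper derives its prototypes from a self-organizing map or a Gaussian mixture model, and its exact weighting is not given in the abstract.

```python
import math


def prototype_weights(value, prototypes, temperature=1.0):
    """Softmax over negative distances to each prototype (assumed weighting)."""
    scores = [-abs(value - p) / temperature for p in prototypes]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def numeral_embedding(value, prototypes, prototype_embeddings, temperature=1.0):
    """Embed a numeral as the weighted average of the prototype embeddings."""
    weights = prototype_weights(value, prototypes, temperature)
    dim = len(prototype_embeddings[0])
    return [
        sum(w * emb[d] for w, emb in zip(weights, prototype_embeddings))
        for d in range(dim)
    ]


# Hypothetical prototypes and 2-dimensional embeddings, for illustration only.
prototypes = [1.0, 10.0, 100.0]
prototype_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# 7.5 never appears among the prototypes, yet it still receives an embedding.
unseen = numeral_embedding(7.5, prototypes, prototype_embeddings, temperature=5.0)
print(unseen)
```

Because any number can be compared with the finite prototype set, an unseen numeral still gets a vector, which is how this construction sidesteps the OOV problem.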
-
GraphSW: a training protocol based on stage-wise training for GNN-based Recommender Model
Authors:
Chang-You Tai,
Meng-Ru Wu,
Yun-Wei Chu,
Shao-Yu Chu
Abstract:
Recently, researchers have utilized knowledge graphs (KGs) as side information in recommendation systems to address the cold start and sparsity issues and improve recommendation performance. Existing KG-aware recommendation models use the features of neighboring entities and structural information to update the embedding of the current entity. Although this rich information is beneficial to the downstream task, the cost of exploring the entire graph is massive and impractical. To reduce the computational cost while maintaining the pattern of feature extraction, KG-aware recommendation models usually use a fixed-size, random set of neighbors rather than the complete information in the KG. Nonetheless, there are two critical issues with these approaches. First, fixed-size, randomly selected neighbors restrict the view of the graph. Second, as the order of graph features increases, the growth in the model's parameter dimensionality may make the training process hard to converge. To address these limitations, we propose GraphSW, a strategy based on a stage-wise training framework that accesses only a subset of the entities in the KG at each stage. The embedding learned in previous stages is provided to the network in the next stage, so the model can learn information from the KG gradually. We apply stage-wise training to two SOTA recommendation models, RippleNet and Knowledge Graph Convolutional Networks (KGCN). Moreover, we evaluate performance on six real-world datasets: Last.FM 2011, Book-Crossing, movie, LFM-1b 2015, Amazon-book, and Yelp 2018. Our experimental results show that the proposed strategy helps both models collect more information from the KG and improves performance. Furthermore, we observe that GraphSW helps KGCN converge effectively with high-order graph features.
Submitted 19 August, 2019; v1 submitted 13 August, 2019;
originally announced August 2019.
-
Convolution Based Spectral Partitioning Architecture for Hyperspectral Image Classification
Authors:
Ringo S. W. Chu,
Ho-Cheung Ng,
Xiwei Wang,
Wayne Luk
Abstract:
Hyperspectral images (HSIs) can distinguish materials using a large number of spectral bands; they are widely adopted in remote sensing applications and enable high-accuracy land cover classification. However, HSI processing is hampered by high dimensionality and a limited amount of labelled data. To address these challenges, this paper proposes a deep learning architecture using three-dimensional convolutional neural networks with spectral partitioning to perform effective feature extraction. We conduct experiments using the Indian Pines and Salinas scenes acquired by the NASA Airborne Visible/Infra-Red Imaging Spectrometer. Compared with prior results, our architecture shows competitive classification performance relative to current methods.
Submitted 27 June, 2019;
originally announced June 2019.
-
Optimizing CNN-based Hyperspectral Image Classification on FPGAs
Authors:
Shuanglong Liu,
Ringo S. W. Chu,
Xiwei Wang,
Wayne Luk
Abstract:
Hyperspectral image (HSI) classification has been widely adopted in applications involving remote sensing imagery analysis, which require high classification accuracy and real-time processing speed. Methods based on convolutional neural networks (CNNs) have been proven to achieve state-of-the-art accuracy in classifying HSIs. However, CNN models are often too computationally intensive to achieve real-time response due to the high-dimensional nature of HSI, compared to traditional methods such as Support Vector Machines (SVMs). Moreover, previous CNN models used for HSI are not specially designed for efficient implementation on embedded devices such as FPGAs. This paper proposes a novel CNN-based algorithm for HSI classification that takes hardware efficiency into account. A customized architecture that enables the proposed algorithm to be mapped effectively onto FPGA resources is then proposed to support real-time on-board classification with low power consumption. Implementation results show that our proposed accelerator on a Xilinx Zynq 706 FPGA board is more than 70x faster than an Intel 8-core Xeon CPU and 3x faster than an NVIDIA GeForce 1080 GPU. Compared to previous SVM-based FPGA accelerators, we achieve comparable processing speed but provide much higher classification accuracy.
Submitted 27 June, 2019;
originally announced June 2019.
-
NeuralDrop: DNN-based Simulation of Small-Scale Liquid Flows on Solids
Authors:
Rajaditya Mukherjee,
Qingyang Li,
Zhili Chen,
Shicheng Chu,
Huamin Wang
Abstract:
Small-scale liquid flows on solid surfaces provide convincing details in liquid animation, but they are difficult to simulate with efficiency and fidelity, mostly due to the complex nature of the surface tension at the contact front where liquid, air, and solid meet. In this paper, we propose to simulate the dynamics of new liquid drops from captured real-world liquid flow data, using deep neural networks. To achieve this goal, we develop a data capture system that acquires liquid flow patterns from hundreds of real-world water drops. We then convert the raw data into compact data for training neural networks, in which liquid drops are represented by their contact fronts in a Lagrangian form. Using LSTM-based recurrent units, our neural networks serve three purposes in our simulator: predicting the contour of a contact front, predicting the color field gradient of a contact front, and finally predicting whether a contact front is going to break or not. Using these predictions, our simulator recovers the overall shape of a liquid drop at every time step, and handles merging and splitting events by simple operations. The experiment shows that our trained neural networks perform predictions well. The whole simulator is robust, convenient to use, and capable of generating realistic small-scale liquid effects in animation.
Submitted 6 November, 2018;
originally announced November 2018.
-
Collaborative Filtering with Stability
Authors:
Dongsheng Li,
Chao Chen,
Qin Lv,
Junchi Yan,
Li Shang,
Stephen M. Chu
Abstract:
Collaborative filtering (CF) is a popular technique in today's recommender systems, and matrix approximation-based CF methods have achieved great success in both rating prediction and top-N recommendation tasks. However, real-world user-item rating matrices are typically sparse, incomplete and noisy, which introduces challenges to the algorithm stability of matrix approximation, i.e., small changes in the training data may significantly change the models. As a result, existing matrix approximation solutions yield low generalization performance, exhibiting high error variance on the training data, and minimizing the training error may not guarantee error reduction on the test data. This paper investigates the algorithm stability problem of matrix approximation methods and how to achieve stable collaborative filtering via stable matrix approximation. We present a new algorithm design framework, which (1) introduces new optimization objectives to guide stable matrix approximation algorithm design, and (2) solves the optimization problem to obtain stable approximation solutions with good generalization performance. Experimental results on real-world datasets demonstrate that the proposed method achieves better accuracy than state-of-the-art matrix approximation methods and ensemble methods in both rating prediction and top-N recommendation tasks.
Submitted 6 November, 2018;
originally announced November 2018.
-
Challenges, Designs, and Performances of a Distributed Algorithm for Minimum-Latency of Data-Aggregation in Multi-Channel WSNs
Authors:
Ngoc-Tu Nguyen,
Bing-Hong Liu,
Shao-I Chu,
Hao-Zhe Weng
Abstract:
In wireless sensor networks (WSNs), the data sensed by sensors need to be gathered, so periodic data collection is a very important application. Much effort has been devoted to developing data collection scheduling algorithms that minimize latency. Most previous work on minimum-latency data collection makes the idealized assumption that the network is a centralized system in which the entire network is completely synchronized with full knowledge of its components. In addition, most existing work assumes that any (or no) data in the network may be aggregated into one packet, and the network models are often treated as tree structures. In practice, however, WSNs are more likely to be distributed systems, since each sensor's knowledge is disjoint from that of the others, and only a fixed number of data may be aggregated into one packet. This motivates us to investigate the problem of minimum-latency data aggregation without data collision in distributed WSNs, where sensors are assigned channels and data are compressed with a flexible aggregation ratio, termed the minimum-latency collision-avoidance multiple-data-aggregation scheduling with multi-channel (MLCAMDAS-MC) problem. A new distributed algorithm, termed the distributed collision-avoidance scheduling (DCAS) algorithm, is proposed to address the MLCAMDAS-MC problem. Finally, we provide theoretical analyses of DCAS and conduct extensive simulations to demonstrate its performance.
Submitted 29 October, 2018;
originally announced October 2018.
-
The Curses of Blockchain Decentralization
Authors:
Shumo Chu,
Sophia Wang
Abstract:
Decentralization, which has backed the hypergrowth of many blockchains, comes at the cost of scalability. To understand this fundamental limitation, this paper proposes a quantitative measure of blockchain decentralization and discusses its implications for various trust models and consensus algorithms. Further, we identify the major challenges in blockchain decentralization. Our key findings are that true decentralization is hard to achieve due to skewed mining power, and that a fully decentralized blockchain inherently limits scalability, as it incurs a throughput upper bound and prevents scaling smart contract execution. To address these challenges, we outline three research directions to explore the trade-offs between decentralization and scalability.
Submitted 6 October, 2018;
originally announced October 2018.
-
High-Performance Multi-Mode Ptychography Reconstruction on Distributed GPUs
Authors:
Zhihua Dong,
Yao-Lung L. Fang,
Xiaojing Huang,
Hanfei Yan,
Sungsoo Ha,
Wei Xu,
Yong S. Chu,
Stuart I. Campbell,
Meifeng Lin
Abstract:
Ptychography is an emerging imaging technique that is able to provide wavelength-limited spatial resolution from specimens with extended lateral dimensions. As a scanning microscopy method, a typical two-dimensional image requires a number of data frames. As a diffraction-based imaging technique, the real-space image has to be recovered through iterative reconstruction algorithms. Due to these two inherent aspects, a ptychographic reconstruction is generally a computation-intensive and time-consuming process, which limits the throughput of this method. We report an accelerated version of the multi-mode difference map algorithm for ptychography reconstruction using multiple distributed GPUs. This approach leverages available scientific computing packages in Python, including mpi4py and PyCUDA, with the core computation functions implemented in CUDA C. Interestingly, we find that even with MPI collective communications, the weak scaling in the number of GPU nodes can still remain nearly constant. Most importantly, for realistic diffraction measurements, we observe a speedup ranging from a factor of $10$ to $10^3$ depending on the data size, which reduces the reconstruction time remarkably from hours to typically about 1 minute and is thus critical for real-time data processing and visualization.
Submitted 30 August, 2018;
originally announced August 2018.
-
Axiomatic Foundations and Algorithms for Deciding Semantic Equivalences of SQL Queries
Authors:
Shumo Chu,
Brendan Murphy,
Jared Roesch,
Alvin Cheung,
Dan Suciu
Abstract:
Deciding the equivalence of SQL queries is a fundamental problem in data management. As prior work has mainly focused on studying the theoretical limitations of the problem, very few implementations for checking such equivalences exist. In this paper, we present a new formalism and implementation for reasoning about the equivalences of SQL queries. Our formalism, U-semiring, extends SQL's semiring semantics with unbounded summation and duplicate elimination. U-semiring is defined using only a few axioms and can thus be easily implemented using proof assistants such as Coq for automated query reasoning. Yet these axioms are sufficient to enable us to reason about sophisticated SQL queries that are evaluated over bags and sets, along with various integrity constraints. To evaluate the effectiveness of U-semiring, we have used it to formally verify 39 query rewrite rules from both classical data management research papers and real-world SQL engines, many of which have never been proven correct before.
Submitted 23 May, 2018; v1 submitted 6 February, 2018;
originally announced February 2018.
-
Modeling The Intensity Function Of Point Process Via Recurrent Neural Networks
Authors:
Shuai Xiao,
Junchi Yan,
Stephen M. Chu,
Xiaokang Yang,
Hongyuan Zha
Abstract:
Event sequences, asynchronously generated with random timestamps, are ubiquitous among applications. The precise and arbitrary timestamps can carry important clues about the underlying dynamics, which makes event data fundamentally different from time series, where the series is indexed with fixed and equal time intervals. One expressive mathematical tool for modeling events is the point process. The intensity functions of many point processes involve two components: the background and the effect of the history. Due to its inherent spontaneousness, the background can be treated as a time series, while the other component needs to handle the history events. In this paper, we model the background by a Recurrent Neural Network (RNN) with its units aligned with time series indexes, while the history effect is modeled by another RNN whose units are aligned with asynchronous events to capture the long-range dynamics. The whole model, with event type and timestamp prediction output layers, can be trained end-to-end. Our approach takes an RNN perspective on point processes and models their background and history effect. For utility, our method allows a black-box treatment for modeling the intensity, which is often a pre-defined parametric form in point processes. Meanwhile, end-to-end training opens the avenue for reusing existing rich techniques in deep networks for point process modeling. We apply our model to the predictive maintenance problem using a log dataset from more than 1000 ATMs of a global bank headquartered in North America.
Submitted 24 May, 2017;
originally announced May 2017.
-
Latent Dependency Forest Models
Authors:
Shanbo Chu,
Yong Jiang,
Kewei Tu
Abstract:
Probabilistic modeling is one of the foundations of modern machine learning and artificial intelligence. In this paper, we propose a novel type of probabilistic model named latent dependency forest models (LDFMs). An LDFM models the dependencies between random variables with a forest structure that can change dynamically based on the variable values. It is therefore capable of modeling context-specific independence. We parameterize an LDFM using a first-order non-projective dependency grammar. Learning LDFMs from data can be formulated purely as a parameter learning problem, and hence the difficult problem of model structure learning is circumvented. Our experimental results show that LDFMs are competitive with existing probabilistic models.
Submitted 20 November, 2016; v1 submitted 7 September, 2016;
originally announced September 2016.
-
HoTTSQL: Proving Query Rewrites with Univalent SQL Semantics
Authors:
Shumo Chu,
Konstantin Weitz,
Alvin Cheung,
Dan Suciu
Abstract:
Every database system contains a query optimizer that performs query rewrites. Unfortunately, developing query optimizers remains a highly challenging task. Part of the challenge comes from the intricacies and rich features of query languages, which make reasoning about rewrite rules difficult. In this paper, we propose a machine-checkable denotational semantics for SQL, the de facto language for relational databases, for rigorously validating rewrite rules. Unlike previously proposed semantics that are either non-mechanized or cover only a small number of SQL language features, our semantics covers all major features of SQL, including bags, correlated subqueries, aggregation, and indexes. Our mechanized semantics, called HoTTSQL, is based on K-Relations and homotopy type theory, where we denote relations as mathematical functions from tuples to univalent types. We have implemented HoTTSQL in Coq, which takes fewer than 300 lines of code, and have proved a wide range of SQL rewrite rules, including those from database research literature (e.g., magic set rewrites) and real-world query optimizers (e.g., subquery elimination). Several of these rewrite rules have never been previously proven correct. In addition, while query equivalence is generally undecidable, we have implemented an automated decision procedure using HoTTSQL for conjunctive queries: a well-studied decidable fragment of SQL that encompasses many real-world queries.
Submitted 5 August, 2016; v1 submitted 16 July, 2016;
originally announced July 2016.
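The HoTTSQL abstract above treats a relation as a function from tuples to a multiplicity-like value. The sketch below illustrates that view in a much simpler setting: it swaps the paper's univalent types for plain natural-number multiplicities (bag semantics over a `Counter`) and checks one rewrite rule on a single hand-made instance. This is an illustrative assumption, not the authors' Coq development, and a check on one instance is not a proof. All function names and the sample relations `R` and `S` are hypothetical.

```python
from collections import Counter


def select(relation, predicate):
    """SELECT * FROM relation WHERE predicate: keep multiplicities of matches."""
    return Counter({t: n for t, n in relation.items() if predicate(t)})


def union_all(r, s):
    """UNION ALL: multiplicities add."""
    return r + s


def distinct(relation):
    """DISTINCT: every tuple that is present gets multiplicity 1."""
    return Counter({t: 1 for t, n in relation.items() if n > 0})


def product(r, s):
    """Cross product: tuples concatenate and multiplicities multiply."""
    return Counter(
        {tr + ts: nr * ns for tr, nr in r.items() for ts, ns in s.items()}
    )


# Hypothetical bags: each key is a tuple, each value is how often it occurs.
R = Counter({(1, "a"): 2, (2, "b"): 1})
S = Counter({(1, "a"): 1, (3, "c"): 4})


def is_small(t):
    """Example predicate on the first column."""
    return t[0] <= 2


# Rewrite rule: selection distributes over UNION ALL.
lhs = select(union_all(R, S), is_small)
rhs = union_all(select(R, is_small), select(S, is_small))
print(lhs == rhs)
```

In this simplified reading, a rewrite rule holds when both sides assign the same multiplicity to every tuple; a mechanized semantics proves that for all relations rather than testing one.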