-
NanoMVG: USV-Centric Low-Power Multi-Task Visual Grounding based on Prompt-Guided Camera and 4D mmWave Radar
Authors:
Runwei Guan,
Jianan Liu,
Liye Jia,
Haocheng Zhao,
Shanliang Yao,
Xiaohui Zhu,
Ka Lok Man,
Eng Gee Lim,
Jeremy Smith,
Yutao Yue
Abstract:
Recently, visual grounding and multi-sensor settings have been incorporated into perception systems for terrestrial autonomous driving and Unmanned Surface Vehicles (USVs), yet the high complexity of modern learning-based multi-sensor visual grounding models prevents them from being deployed on USVs in real life. To this end, we design a low-power multi-task model named NanoMVG for waterway embodied perception, guiding both camera and 4D millimeter-wave radar to locate specific object(s) through natural language. NanoMVG can perform both box-level and mask-level visual grounding tasks simultaneously. Compared to other visual grounding models, NanoMVG achieves highly competitive performance on the WaterVG dataset, particularly in harsh environments, and boasts ultra-low power consumption for long endurance.
Submitted 30 August, 2024;
originally announced August 2024.
-
Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization
Authors:
Geuntaek Lim,
Hyunwoo Kim,
Joonsoo Kim,
Yukyung Choi
Abstract:
Weakly supervised temporal action localization (WTAL) aims to detect action instances in untrimmed videos using only video-level annotations. Since many existing works optimize WTAL models based on action classification labels, they encounter the task discrepancy problem (i.e., localization-by-classification). To tackle this issue, recent studies have attempted to utilize action category names as auxiliary semantic knowledge through vision-language pre-training (VLP). However, there are still areas where existing research falls short. Previous approaches primarily focused on leveraging textual information from language models but overlooked the alignment of dynamic human action and VLP knowledge in a joint space. Furthermore, the deterministic representation employed in previous studies struggles to capture fine-grained human motions. To address these problems, we propose a novel framework that aligns human action knowledge and VLP knowledge in a probabilistic embedding space. Moreover, we propose intra- and inter-distribution contrastive learning to enhance the probabilistic embedding space based on statistical similarities. Extensive experiments and ablation studies reveal that our method significantly outperforms all previous state-of-the-art methods. Code is available at https://github.com/sejong-rcv/PVLR.
Submitted 12 August, 2024;
originally announced August 2024.
-
Eddington Ratios of Dust-obscured Quasars at $z \lesssim 1$: Evidence Supporting Dust-obscured Quasars as Young Quasars
Authors:
Dohyeong Kim,
Yongjung Kim,
Myungshin Im,
Eilat Glikman,
Minjin Kim,
Tanya Urrutia,
Gu Lim
Abstract:
Dust-obscured quasars have been suspected to be intermediate-stage galaxies between merger-driven star-forming galaxies and unobscured quasars. This merger-driven galaxy evolution scenario suggests that dust-obscured quasars exhibit higher Eddington ratios ($λ_{\rm Edd}$) than unobscured quasars. However, their high dust obscuration makes it difficult to measure their $λ_{\rm Edd}$ accurately using commonly employed bolometric luminosity ($L_{\rm bol}$) and black hole (BH) mass ($M_{\rm BH}$) estimators based on the ultraviolet (UV) or optical luminosity. Recently, Kim et al. (2023) established new estimators for $L_{\rm bol}$ and $M_{\rm BH}$ based on mid-infrared (MIR) continuum luminosity ($L_{\rm MIR}$), which are less affected by dust obscuration. These estimators enable the study of a large number of dust-obscured quasars across a wide redshift range. In this study, we measure the $λ_{\rm Edd}$ values of 30 dust-obscured quasars at $z \lesssim 1$, the largest sample size to date, using the $L_{\rm MIR}$-based $L_{\rm bol}$ and $M_{\rm BH}$ estimators. Our findings reveal that dust-obscured quasars exhibit significantly higher $λ_{\rm Edd}$ values than unobscured quasars. Moreover, we confirm that the enhanced $λ_{\rm Edd}$ values of dust-obscured quasars remain consistent across the redshift range of 0 to 1. Our results strongly support the picture that dust-obscured quasars are at an earlier stage than unobscured quasars in the merger-driven galaxy evolutionary track.
Submitted 7 August, 2024; v1 submitted 6 August, 2024;
originally announced August 2024.
-
radarODE: An ODE-Embedded Deep Learning Model for Contactless ECG Reconstruction from Millimeter-Wave Radar
Authors:
Yuanyuan Zhang,
Runwei Guan,
Lingxiao Li,
Rui Yang,
Yutao Yue,
Eng Gee Lim
Abstract:
Radar-based contactless cardiac monitoring has become a popular research direction recently, but the fine-grained electrocardiogram (ECG) signal is still hard to reconstruct from millimeter-wave radar signals. The key obstacle is to decouple the cardiac activities in the electrical domain (i.e., ECG) from those in the mechanical domain (i.e., heartbeat), and most existing research uses purely data-driven methods that treat this domain transformation as a black box. Therefore, this work first proposes a signal model for the domain transformation, and then designs a novel deep learning framework called radarODE to fuse the temporal and morphological features extracted from radar signals and generate ECG. In addition, ordinary differential equations are embedded in radarODE as a decoder to provide a morphological prior, helping model training converge and improving robustness under body movements. Validated on the dataset, the proposed radarODE outperforms the benchmark in terms of missed detection rate, root mean square error, and Pearson correlation coefficient, with improvements of 9%, 16% and 19%, respectively. The validation results imply that radarODE is capable of recovering ECG signals from radar signals with high fidelity and can potentially be implemented in real-life scenarios.
Submitted 3 August, 2024;
originally announced August 2024.
-
Measurement of muon flux behind the beam dump of J-PARC Hadron Experimental Facility
Authors:
T. Matsumura,
Y. Hirayama,
G. Y. Lim,
H. Nanjo,
T. Nomura,
K. Shiomi,
H. Watanabe
Abstract:
A muon-flux measurement behind the beam dump of the J-PARC Hadron Experimental Facility was performed with a compact muon detector that can be inserted into a vertical observation hole, 81 mm in diameter, dug underground. The flux of the muons penetrating the beam dump was scanned vertically at intervals of 0.5 m, showing a wide distribution with a maximum at the beam level. The muon flux was consistent with the expectation from a Monte-Carlo simulation at more than 1 m away from the beam axis, which is expected to be used for signal-loss evaluation in the future KOTO II experiment for measuring rare kaon decays. The data can also be used to improve the accuracy of shielding calculations in radiation protection.
Submitted 25 July, 2024;
originally announced July 2024.
-
MVG-Splatting: Multi-View Guided Gaussian Splatting with Adaptive Quantile-Based Geometric Consistency Densification
Authors:
Zhuoxiao Li,
Shanliang Yao,
Yijie Chu,
Angel F. Garcia-Fernandez,
Yong Yue,
Eng Gee Lim,
Xiaohui Zhu
Abstract:
In the rapidly evolving field of 3D reconstruction, 3D Gaussian Splatting (3DGS) and 2D Gaussian Splatting (2DGS) represent significant advancements. Although 2DGS compresses 3D Gaussian primitives into 2D Gaussian surfels to effectively enhance mesh extraction quality, this compression can potentially lead to a decrease in rendering quality. Additionally, unreliable densification processes and the calculation of depth through the accumulation of opacity can compromise the detail of mesh extraction. To address these issues, we introduce MVG-Splatting, a solution guided by Multi-View considerations. Specifically, we integrate an optimized method for calculating normals, which, combined with image gradients, helps rectify inconsistencies in the original depth computations. Additionally, utilizing projection strategies akin to those in Multi-View Stereo (MVS), we propose an adaptive quantile-based method that dynamically determines the level of additional densification guided by depth maps, from coarse to fine detail. Experimental evidence demonstrates that our method not only resolves the rendering quality degradation caused by depth discrepancies but also facilitates direct mesh extraction from dense Gaussian point clouds using the Marching Cubes algorithm. This approach significantly enhances the overall fidelity and accuracy of the 3D reconstruction process, preserving both geometric detail and visual quality.
Submitted 16 July, 2024;
originally announced July 2024.
-
AirSketch: Generative Motion to Sketch
Authors:
Hui Xian Grace Lim,
Xuanming Cui,
Yogesh S Rawat,
Ser-Nam Lim
Abstract:
Illustration is a fundamental mode of human expression and communication. Certain types of motion that accompany speech can provide this illustrative mode of communication. While Augmented and Virtual Reality technologies (AR/VR) have introduced tools for producing drawings with hand motions (air drawing), they typically require costly hardware and additional digital markers, thereby limiting their accessibility and portability. Furthermore, air drawing demands considerable skill to achieve aesthetic results. To address these challenges, we introduce the concept of AirSketch, aimed at generating faithful and visually coherent sketches directly from hand motions, eliminating the need for complicated headsets or markers. We devise a simple augmentation-based self-supervised training procedure, enabling a controllable image diffusion model to learn to translate from highly noisy hand tracking images to clean, aesthetically pleasing sketches, while preserving the essential visual cues from the original tracking data. We present two air drawing datasets to study this problem. Our findings demonstrate that beyond producing photo-realistic images from precise spatial inputs, controllable image diffusion can effectively produce a refined, clear sketch from a noisy input. Our work serves as an initial step towards marker-less air drawing and reveals distinct applications of controllable diffusion models to AirSketch and AR/VR in general.
Submitted 11 July, 2024;
originally announced July 2024.
-
Talk2Radar: Bridging Natural Language with 4D mmWave Radar for 3D Referring Expression Comprehension
Authors:
Runwei Guan,
Ruixiao Zhang,
Ningwei Ouyang,
Jianan Liu,
Ka Lok Man,
Xiaohao Cai,
Ming Xu,
Jeremy Smith,
Eng Gee Lim,
Yutao Yue,
Hui Xiong
Abstract:
Embodied perception is essential for intelligent vehicles and robots in interactive environmental understanding. However, these advancements primarily focus on vision, with limited attention given to using 3D modeling sensors, restricting a comprehensive understanding of objects in response to prompts containing qualitative and quantitative queries. Recently, as a promising automotive sensor with affordable cost, 4D millimeter-wave radars provide denser point clouds than conventional radars and perceive both semantic and physical characteristics of objects, thereby enhancing the reliability of perception systems. To foster the development of natural language-driven context understanding in radar scenes for 3D visual grounding, we construct the first dataset, Talk2Radar, which bridges these two modalities for 3D Referring Expression Comprehension (REC). Talk2Radar contains 8,682 referring prompt samples with 20,558 referred objects. Moreover, we propose a novel model, T-RadarNet, for 3D REC on point clouds, achieving State-Of-The-Art (SOTA) performance on the Talk2Radar dataset compared to counterparts. Deformable-FPN and Gated Graph Fusion are meticulously designed for efficient point cloud feature modeling and cross-modal fusion between radar and text features, respectively. Comprehensive experiments provide deep insights into radar-based 3D REC. We release our project at https://github.com/GuanRunwei/Talk2Radar.
Submitted 18 July, 2024; v1 submitted 21 May, 2024;
originally announced May 2024.
-
The Robotic MAAO 0.7m Telescope System: Performance and Standard Photometric System
Authors:
Gu Lim,
Dohyeong Kim,
Seonghun Lim,
Myungshin Im,
Hyeonho Choi,
Jaemin Park,
Keun-Hong Park,
Junyeong Park,
Chaudhary Muskaan,
Donghyun Kim,
Hayeong Jeong
Abstract:
We introduce a 0.7m telescope system at the Miryang Arirang Astronomical Observatory (MAAO), a public observatory in Miryang, Korea. System integration and a scheduling program enable the 0.7m telescope system to operate completely robotically during nighttime, eliminating the need for human intervention. Using the 0.7m telescope system, we obtain atmospheric extinction coefficients and the zero-point magnitudes by observing standard stars. As a result, we find that atmospheric extinctions are moderate but they can sometimes increase depending on the weather conditions. The measured 5-sigma limiting magnitudes reach down to BVRI=19.4-19.6 AB mag for a point source with a total integrated time of 10 minutes under clear weather conditions, demonstrating comparable performance with other observational facilities operating under similar specifications and sky conditions. We expect that the newly established MAAO 0.7m telescope system will contribute significantly to the observational studies of astronomy. Particularly, with its capability for robotic observations, this system, although its primary duty is for public viewing, can be extensively used for the time-series observation of transients.
Submitted 24 April, 2024;
originally announced April 2024.
-
Redefining the Shortest Path Problem Formulation of the Linear Non-Gaussian Acyclic Model: Pairwise Likelihood Ratios, Prior Knowledge, and Path Enumeration
Authors:
Hans Jarett J. Ong,
Brian Godwin S. Lim
Abstract:
Effective causal discovery is essential for learning the causal graph from observational data. The linear non-Gaussian acyclic model (LiNGAM) operates under the assumption of a linear data generating process with non-Gaussian noise in determining the causal graph. Its assumption of unmeasured confounders being absent, however, poses practical limitations. In response, empirical research has shown that the reformulation of LiNGAM as a shortest path problem (LiNGAM-SPP) addresses this limitation. Within LiNGAM-SPP, mutual information is chosen to serve as the measure of independence. A challenge is introduced - parameter tuning is now needed due to its reliance on kNN mutual information estimators. The paper proposes a threefold enhancement to the LiNGAM-SPP framework.
First, the need for parameter tuning is eliminated by using the pairwise likelihood ratio in lieu of kNN-based mutual information. This substitution is validated on a general data generating process and benchmark real-world data sets, outperforming existing methods especially when given a larger set of features. The incorporation of prior knowledge is then enabled by a node-skipping strategy implemented on the graph representation of all causal orderings to eliminate violations based on the provided input of relative orderings. Flexibility relative to existing approaches is achieved. Last among the three enhancements is the utilization of the distribution of paths in the graph representation of all causal orderings. From this, crucial properties of the true causal graph such as the presence of unmeasured confounders and sparsity may be inferred. To some extent, the expected performance of the causal discovery algorithm may be predicted. The refinements above advance the practicality and performance of LiNGAM-SPP, showcasing the potential of graph-search-based methodologies in advancing causal discovery.
Submitted 18 April, 2024;
originally announced April 2024.
-
Quantitative bordism over acyclic groups and Cheeger-Gromov $ρ$-invariants
Authors:
Jae Choon Cha,
Geunho Lim
Abstract:
We prove a bordism version of Gromov's linearity conjecture over a large family of acyclic groups, for manifolds of arbitrary dimension. Every group embeds into one of these acyclic groups, and thus it follows that the conjecture is true if one is allowed to enlarge a given group. Our result holds in both PL and smooth categories, and for both oriented and unoriented cases. In the PL case, our results hold without assuming bounded local geometry. As an application, we prove that there is a universal linear bound for the Cheeger-Gromov $L^2$ $ρ$-invariants of PL $(4k-1)$-manifolds associated with arbitrary regular covers. We also show that the minimum number of simplices in a PL triangulation of $(4k-1)$-manifolds with a fixed simple homotopy type is unbounded if the fundamental group has nontrivial torsion. The proof of our main results builds on quantitative algebraic and geometric techniques over the simplicial classifying spaces of groups.
Submitted 3 May, 2024; v1 submitted 18 April, 2024;
originally announced April 2024.
-
Referring Flexible Image Restoration
Authors:
Runwei Guan,
Rongsheng Hu,
Zhuhao Zhou,
Tianlang Xue,
Ka Lok Man,
Jeremy Smith,
Eng Gee Lim,
Weiping Ding,
Yutao Yue
Abstract:
In reality, images often exhibit multiple degradations, such as rain and fog at night (triple degradations). However, in many cases, individuals may not want to remove all degradations, for instance, a blurry lens revealing a beautiful snowy landscape (double degradations). In such scenarios, people may only desire to deblur. These situations and requirements shed light on a new challenge in image restoration, where a model must perceive and remove specific degradation types specified by human commands in images with multiple degradations. We term this task Referring Flexible Image Restoration (RFIR). To address this, we first construct a large-scale synthetic dataset called RFIR, comprising 153,423 samples with the degraded image, text prompt for specific degradation removal and restored image. RFIR consists of five basic degradation types: blur, rain, haze, low light and snow while six main sub-categories are included for varying degrees of degradation removal. To tackle the challenge, we propose a novel transformer-based multi-task model named TransRFIR, which simultaneously perceives degradation types in the degraded image and removes specific degradation upon text prompt. TransRFIR is based on two devised attention modules, Multi-Head Agent Self-Attention (MHASA) and Multi-Head Agent Cross Attention (MHACA), where MHASA and MHACA introduce the agent token and reach the linear complexity, achieving lower computation cost than vanilla self-attention and cross-attention and obtaining competitive performances. Our TransRFIR achieves state-of-the-art performances compared with other counterparts and is proven as an effective architecture for image restoration. We release our project at https://github.com/GuanRunwei/FIR-CP.
Submitted 16 April, 2024;
originally announced April 2024.
-
Evaluation of an LLM in Identifying Logical Fallacies: A Call for Rigor When Adopting LLMs in HCI Research
Authors:
Gionnieve Lim,
Simon T. Perrault
Abstract:
There is increasing interest in the adoption of LLMs in HCI research. However, LLMs may often be regarded as a panacea because of their powerful capabilities with an accompanying oversight on whether they are suitable for their intended tasks. We contend that LLMs should be adopted in a critical manner following rigorous evaluation. Accordingly, we present the evaluation of an LLM in identifying logical fallacies that will form part of a digital misinformation intervention. By comparing to a labeled dataset, we found that GPT-4 achieves an accuracy of 0.79, and for our intended use case that excludes invalid or unidentified instances, an accuracy of 0.90. This gives us the confidence to proceed with the application of the LLM while keeping in mind the areas where it still falls short. The paper describes our evaluation approach, results and reflections on the use of the LLM for our intended task.
Submitted 8 April, 2024;
originally announced April 2024.
-
Rapid AIdeation: Generating Ideas With the Self and in Collaboration With Large Language Models
Authors:
Gionnieve Lim,
Simon T. Perrault
Abstract:
Generative artificial intelligence (GenAI) can rapidly produce large and diverse volumes of content. This lends to it a quality of creativity which can be empowering in the early stages of design. In seeking to understand how creative ways to address practical issues can be conceived between humans and GenAI, we conducted a rapid ideation workshop with 21 participants where they used a large language model (LLM) to brainstorm potential solutions and evaluate them. We found that the LLM produced a greater variety of ideas that were of high quality, though not necessarily of higher quality than human-generated ideas. Participants typically prompted in a straightforward manner with concise instructions. We also observed two collaborative dynamics with the LLM fulfilling a consulting role or an assisting role depending on the goals of the users. Notably, we observed an atypical anti-collaboration dynamic where participants used an antagonistic approach to prompt the LLM.
Submitted 19 March, 2024;
originally announced March 2024.
-
Effects of Automated Misinformation Warning Labels on the Intents to Like, Comment and Share Posts
Authors:
Gionnieve Lim,
Simon T. Perrault
Abstract:
With fact-checking by professionals being difficult to scale on social media, algorithmic techniques have been considered. However, it is uncertain how the public may react to labels by automated fact-checkers. In this study, we investigate the use of automated warning labels derived from misinformation detection literature and their effects on three forms of post engagement. Focusing on political posts, we also consider how partisanship affects engagement. In a two-phase within-subjects experiment with 200 participants, we found that the generic warnings suppressed intents to comment on and share posts, but not the intent to like them. Furthermore, when different reasons for the labels were provided, their effects on post engagement were inconsistent, suggesting that the reasons could have undesirably motivated engagement instead. Partisanship effects were observed across the labels with higher engagement for politically congruent posts. We discuss the implications for the design and use of automated warning labels.
Submitted 19 March, 2024;
originally announced March 2024.
-
Fact Checking Chatbot: A Misinformation Intervention for Instant Messaging Apps and an Analysis of Trust in the Fact Checkers
Authors:
Gionnieve Lim,
Simon T. Perrault
Abstract:
In Singapore, there has been a rise in misinformation on mobile instant messaging services (MIMS). MIMS support both small peer-to-peer networks and large groups. Misinformation in the former may spread due to recipients' trust in the sender while in the latter, misinformation can directly reach a wide audience. The encryption of MIMS makes it difficult to address misinformation directly. As such, chatbots have become an alternative solution where users can disclose their chat content directly to fact checking services. To understand how effective fact checking chatbots are as an intervention and how trust in three different fact checkers (i.e., Government, News Outlets, and Artificial Intelligence) may affect this trust, we conducted a within-subjects experiment with 527 Singapore residents. We found mixed results for the fact checkers but support for the chatbot intervention overall. We also found a striking contradiction between participants' trust in the fact checkers and their behaviour towards them. Specifically, those who reported a high level of trust in the government performed worse and tended to follow the fact checking tool less when it was endorsed by the government.
Submitted 19 March, 2024;
originally announced March 2024.
-
WaterVG: Waterway Visual Grounding based on Text-Guided Vision and mmWave Radar
Authors:
Runwei Guan,
Liye Jia,
Fengyufan Yang,
Shanliang Yao,
Erick Purwanto,
Xiaohui Zhu,
Eng Gee Lim,
Jeremy Smith,
Ka Lok Man,
Xuming Hu,
Yutao Yue
Abstract:
The perception of waterways based on human intent is significant for autonomous navigation and operations of Unmanned Surface Vehicles (USVs) in water environments. Inspired by visual grounding, we introduce WaterVG, the first visual grounding dataset designed for USV-based waterway perception based on human prompts. WaterVG encompasses prompts describing multiple targets, with annotations at the instance level including bounding boxes and masks. Notably, WaterVG includes 11,568 samples with 34,987 referred targets, whose prompts integrate both visual and radar characteristics. The pattern of text-guided two sensors equips a finer granularity of text prompts with visual and radar features of referred targets. Moreover, we propose a low-power visual grounding model, Potamoi, which is a multi-task model with a well-designed Phased Heterogeneous Modality Fusion (PHMF) mode, including Adaptive Radar Weighting (ARW) and Multi-Head Slim Cross Attention (MHSCA). Specifically, ARW extracts required radar features to fuse with vision for prompt alignment. MHSCA is an efficient fusion module with a remarkably small parameter count and FLOPs, elegantly fusing scenario context captured by two sensors with linguistic features, which performs expressively on visual grounding tasks. Comprehensive experiments and evaluations have been conducted on WaterVG, where our Potamoi achieves state-of-the-art performance compared with counterparts.
Submitted 4 April, 2024; v1 submitted 19 March, 2024;
originally announced March 2024.
-
Contextualized Messages Boost Graph Representations
Authors:
Brian Godwin Lim,
Galvin Brice Lim,
Renzo Roel Tan,
Kazushi Ikeda
Abstract:
Graph neural networks (GNNs) have gained significant attention in recent years for their ability to process data that may be represented as graphs. This success has prompted several studies to explore the representational capability of GNNs based on the graph isomorphism task. These works inherently assume a countable node feature representation, potentially limiting their applicability. Interestingly, only a few theoretical works study GNNs with uncountable node feature representation. This paper presents a novel perspective on the representational capability of GNNs across all levels - node-level, neighborhood-level, and graph-level - when the space of node feature representation is uncountable. Specifically, it relaxes the injective requirement in previous works by employing an implicit pseudometric distance on the space of input to create a soft-injective function. This allows distinct inputs to produce similar outputs only if the pseudometric deems the inputs to be sufficiently similar on some representation, which is often useful in practice. As a consequence, a novel soft-isomorphic relational graph convolution network (SIR-GCN) that emphasizes non-linear and contextualized transformation of neighborhood feature representations is proposed. A mathematical discussion on the relationship between SIR-GCN and widely used GNNs is then laid out to put the contribution in context, establishing SIR-GCN as a generalization of classical GNN methodologies. Experiments on synthetic and benchmark datasets demonstrate the relative superiority of SIR-GCN, outperforming comparable models in node and graph property prediction tasks.
Submitted 22 May, 2024; v1 submitted 19 March, 2024;
originally announced March 2024.
-
FSViewFusion: Few-Shots View Generation of Novel Objects
Authors:
Rukhshanda Hussain,
Hui Xian Grace Lim,
Borchun Chen,
Mubarak Shah,
Ser Nam Lim
Abstract:
Novel view synthesis has observed tremendous developments since the arrival of NeRFs. However, NeRF models overfit on a single scene, lacking generalization to out-of-distribution objects. Recently, diffusion models have exhibited remarkable performance on introducing generalization in view synthesis. Inspired by these advancements, we explore the capabilities of a pretrained stable diffusion model for view synthesis without explicit 3D priors. Specifically, we base our method on a personalized text-to-image model, Dreambooth, given its strong ability to adapt to specific novel objects with a few shots. Our research reveals two interesting findings. First, we observe that Dreambooth can learn the high-level concept of a view, compared to arguably more complex strategies which involve finetuning diffusions on large amounts of multi-view data. Second, we establish that the concept of a view can be disentangled and transferred to a novel object irrespective of the original object's identity from which the views are learnt. Motivated by this, we introduce a learning strategy, FSViewFusion, which inherits a specific view through only one image sample of a single scene, and transfers the knowledge to a novel object, learnt from few shots, using low-rank adapters. Through extensive experiments we demonstrate that our method, albeit simple, is efficient in generating reliable view samples for in-the-wild images. Code and models will be released.
Submitted 12 March, 2024; v1 submitted 10 March, 2024;
originally announced March 2024.
-
Development and Testing of a Novel Large Language Model-Based Clinical Decision Support Systems for Medication Safety in 12 Clinical Specialties
Authors:
Jasmine Chiat Ling Ong,
Liyuan Jin,
Kabilan Elangovan,
Gilbert Yong San Lim,
Daniel Yan Zheng Lim,
Gerald Gui Ren Sng,
Yuhe Ke,
Joshua Yi Min Tung,
Ryan Jian Zhong,
Christopher Ming Yao Koh,
Keane Zhi Hao Lee,
Xiang Chen,
Jack Kian Chng,
Aung Than,
Ken Junyang Goh,
Daniel Shu Wei Ting
Abstract:
Importance: We introduce a novel Retrieval Augmented Generation (RAG)-Large Language Model (LLM) framework as a Clinical Decision Support Systems (CDSS) to support safe medication prescription.
Objective: To evaluate the efficacy of LLM-based CDSS in correctly identifying medication errors in different patient case vignettes from diverse medical and surgical sub-disciplines, against a ground truth derived from a human expert panel. We compared performance under 2 different practical healthcare integration modalities for the CDSS: LLM-based CDSS alone (fully autonomous mode) vs junior pharmacist + LLM-based CDSS (co-pilot, assistive mode).
Design, Setting, and Participants: Utilizing a RAG model with state-of-the-art medically-related LLMs (GPT-4, Gemini Pro 1.0 and Med-PaLM 2), this study used 61 prescribing error scenarios embedded into 23 complex clinical vignettes across 12 different medical and surgical specialties. A multidisciplinary expert panel assessed these cases for Drug-Related Problems (DRPs) using the PCNE classification and graded severity / potential for harm using revised NCC MERP medication error index. We compared.
Results: RAG-LLM performed better compared to LLM alone. When employed in a co-pilot mode, accuracy, recall, and F1 scores were optimized, indicating effectiveness in identifying moderate to severe DRPs. The accuracy of DRP detection with RAG-LLM improved in several categories but at the expense of lower precision.
Conclusions: This study established that a RAG-LLM based CDSS significantly boosts the accuracy of medication error identification when used alongside junior pharmacists (co-pilot), with notable improvements in detecting severe DRPs. This study also illuminates the comparative performance of current state-of-the-art LLMs in RAG-based CDSS systems.
Submitted 17 February, 2024; v1 submitted 29 January, 2024;
originally announced February 2024.
-
Help Me Reflect: Leveraging Self-Reflection Interface Nudges to Enhance Deliberativeness on Online Deliberation Platforms
Authors:
Shun Yi Yeo,
Gionnieve Lim,
Jie Gao,
Weiyu Zhang,
Simon Tangi Perrault
Abstract:
The deliberative potential of online platforms has been widely examined. However, little is known about how various interface-based reflection nudges impact the quality of deliberation. This paper presents two user studies with 12 and 120 participants, respectively, to investigate the impacts of different reflective nudges on the quality of deliberation. In the first study, we examined five distinct reflective nudges: persona, temporal prompts, analogies and metaphors, cultural prompts and storytelling. Persona, temporal prompts, and storytelling emerged as the preferred nudges for implementation on online deliberation platforms. In the second study, we assess the impacts of these preferred reflectors more thoroughly. Results revealed a significant positive impact of these reflectors on deliberative quality. Specifically, persona promotes a deliberative environment for balanced and opinionated viewpoints while temporal prompts promote more individualised viewpoints. Our findings suggest that the choice of reflectors can significantly influence the dynamics and shape the nature of online discussions.
Submitted 19 January, 2024;
originally announced January 2024.
-
Achelous++: Power-Oriented Water-Surface Panoptic Perception Framework on Edge Devices based on Vision-Radar Fusion and Pruning of Heterogeneous Modalities
Authors:
Runwei Guan,
Haocheng Zhao,
Shanliang Yao,
Ka Lok Man,
Xiaohui Zhu,
Limin Yu,
Yong Yue,
Jeremy Smith,
Eng Gee Lim,
Weiping Ding,
Yutao Yue
Abstract:
Urban water-surface robust perception serves as the foundation for intelligent monitoring of aquatic environments and the autonomous navigation and operation of unmanned vessels, especially in the context of waterway safety. It is worth noting that current multi-sensor fusion and multi-task learning models consume substantial power and heavily rely on high-power GPUs for inference. This contributes to increased carbon emissions, a concern that runs counter to the prevailing emphasis on environmental preservation and the pursuit of sustainable, low-carbon urban environments. In light of these concerns, this paper concentrates on low-power, lightweight, multi-task panoptic perception through the fusion of visual and 4D radar data, which is seen as a promising low-cost perception method. We propose a framework named Achelous++ that facilitates the development and comprehensive evaluation of multi-task water-surface panoptic perception models. Achelous++ can simultaneously execute five perception tasks with high speed and low power consumption, including object detection, object semantic segmentation, drivable-area segmentation, waterline segmentation, and radar point cloud semantic segmentation. Furthermore, to meet the demand for developers to customize models for real-time inference on low-performance devices, a novel multi-modal pruning strategy known as Heterogeneous-Aware SynFlow (HA-SynFlow) is proposed. Besides, Achelous++ also supports random pruning at initialization with different layer-wise sparsity, such as Uniform and Erdos-Renyi-Kernel (ERK). Overall, our Achelous++ framework achieves state-of-the-art performance on the WaterScenes benchmark, excelling in both accuracy and power efficiency compared to other single-task and multi-task models. We release and maintain the code at https://github.com/GuanRunwei/Achelous.
Submitted 14 December, 2023;
originally announced December 2023.
-
Exploring Radar Data Representations in Autonomous Driving: A Comprehensive Review
Authors:
Shanliang Yao,
Runwei Guan,
Zitian Peng,
Chenhang Xu,
Yilu Shi,
Weiping Ding,
Eng Gee Lim,
Yong Yue,
Hyungjoon Seo,
Ka Lok Man,
Jieming Ma,
Xiaohui Zhu,
Yutao Yue
Abstract:
With the rapid advancements of sensor technology and deep learning, autonomous driving systems are providing safe and efficient access to intelligent vehicles as well as intelligent transportation. Among these equipped sensors, the radar sensor plays a crucial role in providing robust perception information in diverse environmental conditions. This review focuses on exploring different radar data representations utilized in autonomous driving systems. Firstly, we introduce the capabilities and limitations of the radar sensor by examining the working principles of radar perception and signal processing of radar measurements. Then, we delve into the generation process of five radar representations, including the ADC signal, radar tensor, point cloud, grid map, and micro-Doppler signature. For each radar representation, we examine the related datasets, methods, advantages and limitations. Furthermore, we discuss the challenges faced in these data representations and propose potential research directions. Above all, this comprehensive review offers an in-depth insight into how these representations enhance autonomous system capabilities, providing guidance for radar perception researchers. To facilitate retrieval and comparison of different data representations, datasets and methods, we provide an interactive website at https://radar-camera-fusion.github.io/radar.
Submitted 19 April, 2024; v1 submitted 8 December, 2023;
originally announced December 2023.
-
Kivi: Verification for Cluster Management
Authors:
Bingzhe Liu,
Gangmuk Lim,
Ryan Beckett,
P. Brighten Godfrey
Abstract:
Modern cloud infrastructure is powered by cluster management systems such as Kubernetes and Docker Swarm. While these systems seek to minimize users' operational burden, the complex, dynamic, and non-deterministic nature of these systems makes them hard to reason about, potentially leading to failures ranging from performance degradation to outages. We present Kivi, the first system for verifying controllers and their configurations in cluster management systems. Kivi focuses on the popular system Kubernetes, and models its controllers and events into processes whereby their interleavings are exhaustively checked via model checking. Central to handling autoscaling and large-scale deployments is our design that seeks to find violations in a smaller and reduced topology. We also develop several model optimizations in Kivi to scale to large clusters. We show that Kivi is effective and accurate in finding issues in realistic and complex scenarios and showcase two new issues in Kubernetes controller source code.
Submitted 5 November, 2023;
originally announced November 2023.
-
Gravitational-wave Electromagnetic Counterpart Korean Observatory (GECKO): GECKO Follow-up Observation of GW190425
Authors:
Gregory S. H. Paek,
Myungshin Im,
Joonho Kim,
Gu Lim,
Bomi Park,
Changsu Choi,
Sophia Kim,
Claudio Barbieri,
Om Sharan Salafia,
Insu Paek,
Suhyun Shin,
Jinguk Seo,
Hyung Mok Lee,
Chung-Uk Lee,
Seung-Lee Kim,
Hyun-Il Sung
Abstract:
One of the keys to the success of multimessenger astronomy is the rapid identification of the electromagnetic wave counterpart, kilonova (KN), of the gravitational-wave (GW) event. Despite its importance, it is hard to find a KN associated with a GW event, due to a poorly constrained GW localization map and numerous signals that could be confused as a KN. Here, we present the Gravitational-wave Electromagnetic wave Counterpart Korean Observatory (GECKO) project, the GECKO observation of GW190425, and prospects of GECKO in the fourth observing run (O4) of the GW detectors. We outline our follow-up observation strategies during O3. In particular, we describe our galaxy-targeted observation criteria that prioritize based on galaxy properties. Armed with this strategy, we performed an optical and/or near-infrared follow-up observation of GW190425, the first binary neutron star merger event during the O3 run. Despite a vast localization area of 7460 deg^2, we observed 621 host galaxy candidates, corresponding to 29.5% of the scores we assigned, with most of them observed within the first 3 days of the GW event. Ten transients were discovered during this search, including a new transient with a host galaxy. No plausible KN was found, but we were still able to constrain the properties of potential KNe using upper limits. The GECKO observation demonstrates that GECKO can possibly uncover a GW170817-like KN at a distance less than 200 Mpc if the localization area is of the order of hundreds of square degrees, providing a bright prospect for the identification of GW electromagnetic wave counterparts during the O4 run.
Submitted 30 October, 2023;
originally announced October 2023.
-
ASY-VRNet: Waterway Panoptic Driving Perception Model based on Asymmetric Fair Fusion of Vision and 4D mmWave Radar
Authors:
Runwei Guan,
Shanliang Yao,
Xiaohui Zhu,
Ka Lok Man,
Yong Yue,
Jeremy Smith,
Eng Gee Lim,
Yutao Yue
Abstract:
Panoptic Driving Perception (PDP) is critical for the autonomous navigation of Unmanned Surface Vehicles (USVs). A PDP model typically integrates multiple tasks, necessitating the simultaneous and robust execution of various perception tasks to facilitate downstream path planning. The fusion of visual and radar sensors is currently acknowledged as a robust and cost-effective approach. However, most existing research has primarily focused on fusing visual and radar features dedicated to object detection or utilizing a shared feature space for multiple tasks, neglecting the individual representation differences between various tasks. To address this gap, we propose a pair of Asymmetric Fair Fusion (AFF) modules with favorable explainability designed to efficiently interact with independent features from both visual and radar modalities, tailored to the specific requirements of object detection and semantic segmentation tasks. The AFF modules treat image and radar maps as irregular point sets and transform these features into a crossed-shared feature space for multitasking, ensuring equitable treatment of vision and radar point cloud features. Leveraging AFF modules, we propose a novel and efficient PDP model, ASY-VRNet, which processes image and radar features based on irregular super-pixel point sets. Additionally, we propose an effective multitask learning method specifically designed for PDP models. Compared to other lightweight models, ASY-VRNet achieves state-of-the-art performance in object detection, semantic segmentation, and drivable-area segmentation on the WaterScenes benchmark. Our project is publicly available at https://github.com/GuanRunwei/ASY-VRNet.
Submitted 4 July, 2024; v1 submitted 20 August, 2023;
originally announced August 2023.
-
XAI in Automated Fact-Checking? The Benefits Are Modest and There's No One-Explanation-Fits-All
Authors:
Gionnieve Lim,
Simon T. Perrault
Abstract:
The massive volume of online information along with the issue of misinformation has spurred active research in the automation of fact-checking. As with fact-checking by human experts, it is not enough for an automated fact-checker to just be accurate; it must also be able to inform and convince the user of the validity of its predictions. This becomes viable with explainable artificial intelligence (XAI). In this work, we conduct a study of XAI fact-checkers involving 180 participants to determine how users' actions towards news and their attitudes towards explanations are affected by the XAI. Our results suggest that XAI has limited effects on users' agreement with the veracity prediction of the automated fact-checker and on their intent to share news. However, XAI nudges users towards forming uniform judgments of news veracity, thereby signaling their reliance on the explanations. We also found polarizing preferences towards XAI and raise several design considerations on them.
Submitted 19 June, 2024; v1 submitted 7 August, 2023;
originally announced August 2023.
-
First results from the JWST Early Release Science Program Q3D: Powerful quasar-driven galactic scale outflow at $z=3$
Authors:
Andrey Vayner,
Nadia L. Zakamska,
Yuzo Ishikawa,
Swetha Sankar,
Dominika Wylezalek,
David S. N. Rupke,
Sylvain Veilleux,
Caroline Bertemes,
Jorge K. Barrera-Ballesteros,
Hsiao-Wen Chen,
Nadiia Diachenko,
Andy D. Goulding,
Jenny E. Greene,
Kevin N. Hainline,
Fred Hamann,
Timothy Heckman,
Sean D. Johnson,
Hui Xian Grace Lim,
Weizhe Liu,
Dieter Lutz,
Nora Lutzgendorf,
Vincenzo Mainieri,
Ryan McCrory,
Grey Murphree,
Nicole P. H. Nesvadba
, et al. (3 additional authors not shown)
Abstract:
Quasar-driven galactic outflows are a major driver of the evolution of massive galaxies. We report observations of a powerful galactic-scale outflow in a $z=3$ extremely red, intrinsically luminous ($L_{\rm bol}\simeq 5\times 10^{47}$erg s$^{-1}$) quasar SDSSJ1652+1728 with the Near Infrared Spectrograph (NIRSpec) on board JWST. We analyze the kinematics of rest-frame optical emission lines and identify the quasar-driven outflow extending out to $\sim 10$ kpc from the quasar with a velocity offset of ($v_{r}=\pm 500$ km s$^{-1}$) and high velocity dispersion (FWHM$=700-2400$ km s$^{-1}$). Owing to JWST's unprecedented surface brightness sensitivity in the near-infrared, we unambiguously show that the powerful high velocity outflow in an extremely red quasar (ERQ) encompasses a large swath of the host galaxy's interstellar medium (ISM). Using the kinematics and dynamics of optical emission lines, we estimate the mass outflow rate -- in the warm ionized phase alone -- to be at least $2300\pm1400$ $M_{\odot}$ yr$^{-1}$. We measure a momentum flux ratio between the outflow and the quasar accretion disk of $\sim$1 on kpc scale, indicating that the outflow was likely driven in a relatively high ($>10^{23}$cm$^{-2}$) column density environment through radiation pressure on dust grains. We find a coupling efficiency between the bolometric luminosity of the quasar and the outflow of 0.1$\%$, matching the theoretical prediction of the minimum coupling efficiency necessary for negative quasar feedback. The outflow has sufficient energetics to drive the observed turbulence seen in shocked regions of the quasar host galaxy, likely directly responsible for prolonging the time it takes for gas to cool efficiently.
Submitted 25 July, 2023;
originally announced July 2023.
-
Achelous: A Fast Unified Water-surface Panoptic Perception Framework based on Fusion of Monocular Camera and 4D mmWave Radar
Authors:
Runwei Guan,
Shanliang Yao,
Xiaohui Zhu,
Ka Lok Man,
Eng Gee Lim,
Jeremy Smith,
Yong Yue,
Yutao Yue
Abstract:
Current perception models for different tasks usually exist in modular forms on Unmanned Surface Vehicles (USVs), which infer extremely slowly in parallel on edge devices, causing asynchrony between perception results and USV position, and leading to erroneous decisions in autonomous navigation. Compared with Unmanned Ground Vehicles (UGVs), the robust perception of USVs develops relatively slowly. Moreover, most current multi-task perception models are huge in parameters, slow in inference and not scalable. To address this, we propose Achelous, a low-cost and fast unified panoptic perception framework for water-surface perception based on the fusion of a monocular camera and 4D mmWave radar. Achelous can simultaneously perform five tasks: detection and segmentation of visual targets, drivable-area segmentation, waterline segmentation and radar point cloud segmentation. Besides, models in the Achelous family, with fewer than about 5 million parameters, achieve about 18 FPS on an NVIDIA Jetson AGX Xavier, 11 FPS faster than HybridNets, and exceed YOLOX-Tiny and Segformer-B0 on our collected dataset by about 5 mAP$_{\text{50-95}}$ and 0.7 mIoU, especially under situations of adverse weather, dark environments and camera failure. To our knowledge, Achelous is the first comprehensive panoptic perception framework combining vision-level and point-cloud-level tasks for water-surface perception. To promote the development of the intelligent transportation community, we release our code at https://github.com/GuanRunwei/Achelous.
Submitted 13 July, 2023;
originally announced July 2023.
-
WaterScenes: A Multi-Task 4D Radar-Camera Fusion Dataset and Benchmarks for Autonomous Driving on Water Surfaces
Authors:
Shanliang Yao,
Runwei Guan,
Zhaodong Wu,
Yi Ni,
Zile Huang,
Ryan Wen Liu,
Yong Yue,
Weiping Ding,
Eng Gee Lim,
Hyungjoon Seo,
Ka Lok Man,
Jieming Ma,
Xiaohui Zhu,
Yutao Yue
Abstract:
Autonomous driving on water surfaces plays an essential role in executing hazardous and time-consuming missions, such as maritime surveillance, survivor rescue, environmental monitoring, hydrographic mapping and waste cleaning. This work presents WaterScenes, the first multi-task 4D radar-camera fusion dataset for autonomous driving on water surfaces. Equipped with a 4D radar and a monocular camera, our Unmanned Surface Vehicle (USV) proffers all-weather solutions for discerning object-related information, including color, shape, texture, range, velocity, azimuth, and elevation. Focusing on typical static and dynamic objects on water surfaces, we label the camera images and radar point clouds at pixel-level and point-level, respectively. In addition to basic perception tasks, such as object detection, instance segmentation and semantic segmentation, we also provide annotations for free-space segmentation and waterline segmentation. Leveraging the multi-task and multi-modal data, we conduct benchmark experiments on the uni-modality of radar and camera, as well as the fused modalities. Experimental results demonstrate that 4D radar-camera fusion can considerably improve the accuracy and robustness of perception on water surfaces, especially in adverse lighting and weather conditions. The WaterScenes dataset is publicly available at https://waterscenes.github.io.
Submitted 15 June, 2024; v1 submitted 12 July, 2023;
originally announced July 2023.
-
On Evaluation of Document Classification using RVL-CDIP
Authors:
Stefan Larson,
Gordon Lim,
Kevin Leach
Abstract:
The RVL-CDIP benchmark is widely used for measuring performance on the task of document classification. Despite its widespread use, we reveal several undesirable characteristics of the RVL-CDIP benchmark. These include (1) substantial amounts of label noise, which we estimate to be 8.1% (ranging from 1.6% to 16.9% per document category); (2) presence of many ambiguous or multi-label documents; (3) a large overlap between test and train splits, which can inflate model performance metrics; and (4) presence of sensitive personally-identifiable information like US Social Security numbers (SSNs). We argue that there is a risk in using RVL-CDIP for benchmarking document classifiers, as its limited scope, presence of errors (state-of-the-art models now achieve accuracy error rates that are within our estimated label error rate), and lack of diversity make it less than ideal for benchmarking. We further advocate for the creation of a new document classification benchmark, and provide recommendations for what characteristics such a resource should include.
Submitted 21 June, 2023;
originally announced June 2023.
-
First results from the JWST Early Release Science Program Q3D: Benchmark Comparison of Optical and Mid-IR Tracers of a Dusty, Ionized Red Quasar Wind at z=0.435
Authors:
D. S. N. Rupke,
D. Wylezalek,
N. L. Zakamska,
S. Veilleux,
C. Bertemes,
Y. Ishikawa,
W. Liu,
S. Sankar,
A. Vayner,
H. X. G. Lim,
R. McCrory,
G. Murphree,
L. Whitesell,
L. Shen,
G. Liu,
J. K. Barrera-Ballesteros,
H. -W. Chen,
N. Diachenko,
A. D. Goulding,
J. E. Greene,
K. N. Hainline,
F. Hamann,
T. Heckman,
S. D. Johnson,
D. Lutz
, et al. (5 additional authors not shown)
Abstract:
The [OIII] 5007 A emission line is the most common tracer of warm, ionized outflows in active galactic nuclei across cosmic time. JWST newly allows us to use mid-infrared spectral features at both high spatial and spectral resolution to probe these same winds. Here we present a comparison of ground-based, seeing-limited [OIII] and space-based, diffraction-limited [SIV] 10.51 micron maps of the powerful, kiloparsec-scale outflow in the Type 1 red quasar SDSS J110648.32+480712.3. The JWST data are from the Mid-InfraRed Instrument (MIRI). There is a close match in resolution between the datasets (0."6), in ionization potential of the O$^{+2}$ and S$^{+3}$ ions (35 eV), and in line sensitivity (1e-17 to 2e-17 erg/s/cm$^2$/arcsec$^2$). The [OIII] and [SIV] line shapes match in velocity and linewidth over much of the 20 kpc outflowing nebula, and [SIV] is the brightest line in the rest-frame 3.5-19.5 micron range, demonstrating its usefulness as a mid-IR probe of quasar outflows. [OIII] is nevertheless intrinsically brighter and provides better contrast with the point-source continuum, which is strong in the mid-IR. There is a strong anticorrelation of [OIII]/[SIV] with average velocity, which is consistent with a scenario of differential obscuration between the approaching (blueshifted) and receding (redshifted) sides of the flow. The dust in the wind may also obscure the central quasar, consistent with models that attribute red quasar extinction to dusty winds.
△ Less
Submitted 11 December, 2023; v1 submitted 21 June, 2023;
originally announced June 2023.
-
SWAM: Revisiting Swap and OOMK for Improving Application Responsiveness on Mobile Devices
Authors:
Geunsik Lim,
Donghyun Kang,
MyungJoo Ham,
Young Ik Eom
Abstract:
Existing memory reclamation policies on mobile devices may no longer be valid because they have negative effects on the response time of running applications. In this paper, we propose SWAM, a new integrated memory management technique that complements the shortcomings of both the swapping and killing mechanisms in mobile devices and improves application responsiveness. SWAM consists of (1) Adaptive Swap that performs swapping adaptively into memory or storage device while managing the swap space dynamically, (2) OOM Cleaner that reclaims shared object pages in the swap space to secure available memory and storage space, and (3) EOOM Killer that terminates processes in the worst case while prioritizing the lowest initialization cost applications as victim processes first. Experimental results demonstrate that SWAM significantly reduces the number of applications killed by OOMK (6.5x lower), and improves application launch time (36% faster) and response time (41% faster), compared to the conventional schemes.
Submitted 14 June, 2023;
originally announced June 2023.
-
FindVehicle and VehicleFinder: A NER dataset for natural language-based vehicle retrieval and a keyword-based cross-modal vehicle retrieval system
Authors:
Runwei Guan,
Ka Lok Man,
Feifan Chen,
Shanliang Yao,
Rongsheng Hu,
Xiaohui Zhu,
Jeremy Smith,
Eng Gee Lim,
Yutao Yue
Abstract:
Natural language (NL) based vehicle retrieval is a task aiming to retrieve a vehicle that is most consistent with a given NL query from among all candidate vehicles. Because NL query can be easily obtained, such a task has a promising prospect in building an interactive intelligent traffic system (ITS). Current solutions mainly focus on extracting both text and image features and mapping them to the same latent space to compare the similarity. However, existing methods usually use dependency analysis or semantic role-labelling techniques to find keywords related to vehicle attributes. These techniques may require a lot of pre-processing and post-processing work, and also suffer from extracting the wrong keyword when the NL query is complex. To tackle these problems and simplify, we borrow the idea from named entity recognition (NER) and construct FindVehicle, a NER dataset in the traffic domain. It has 42.3k labelled NL descriptions of vehicle tracks, containing information such as the location, orientation, type and colour of the vehicle. FindVehicle also adopts both overlapping entities and fine-grained entities to meet further requirements. To verify its effectiveness, we propose a baseline NL-based vehicle retrieval model called VehicleFinder. Our experiment shows that by using text encoders pre-trained by FindVehicle, VehicleFinder achieves 87.7\% precision and 89.4\% recall when retrieving a target vehicle by text command on our homemade dataset based on UA-DETRAC. The time cost of VehicleFinder is 279.35 ms on one ARM v8.2 CPU and 93.72 ms on one RTX A4000 GPU, which is much faster than the Transformer-based system. The dataset is open-source via the link https://github.com/GuanRunwei/FindVehicle, and the implementation can be found via the link https://github.com/GuanRunwei/VehicleFinder-CTIM.
Submitted 21 April, 2023;
originally announced April 2023.
-
Radar-Camera Fusion for Object Detection and Semantic Segmentation in Autonomous Driving: A Comprehensive Review
Authors:
Shanliang Yao,
Runwei Guan,
Xiaoyu Huang,
Zhuoxiao Li,
Xiangyu Sha,
Yong Yue,
Eng Gee Lim,
Hyungjoon Seo,
Ka Lok Man,
Xiaohui Zhu,
Yutao Yue
Abstract:
Driven by deep learning techniques, perception technology in autonomous driving has developed rapidly in recent years, enabling vehicles to accurately detect and interpret the surrounding environment for safe and efficient navigation. To achieve accurate and robust perception capabilities, autonomous vehicles are often equipped with multiple sensors, making sensor fusion a crucial part of the perception system. Among these fused sensors, radars and cameras enable a complementary and cost-effective perception of the surrounding environment regardless of lighting and weather conditions. This review aims to provide a comprehensive guideline for radar-camera fusion, particularly concentrating on perception tasks related to object detection and semantic segmentation. Based on the principles of the radar and camera sensors, we delve into the data processing process and representations, followed by an in-depth analysis and summary of radar-camera fusion datasets. In the review of methodologies in radar-camera fusion, we address interrogative questions, including "why to fuse", "what to fuse", "where to fuse", "when to fuse", and "how to fuse", subsequently discussing various challenges and potential research directions within this domain. To ease the retrieval and comparison of datasets and fusion methods, we also provide an interactive website: https://radar-camera-fusion.github.io.
Submitted 23 August, 2023; v1 submitted 20 April, 2023;
originally announced April 2023.
-
CollabCoder: A Lower-barrier, Rigorous Workflow for Inductive Collaborative Qualitative Analysis with Large Language Models
Authors:
Jie Gao,
Yuchen Guo,
Gionnieve Lim,
Tianqin Zhang,
Zheng Zhang,
Toby Jia-Jun Li,
Simon Tangi Perrault
Abstract:
Collaborative Qualitative Analysis (CQA) can enhance qualitative analysis rigor and depth by incorporating varied viewpoints. Nevertheless, ensuring a rigorous CQA procedure itself can be both demanding and costly. To lower this bar, we take a theoretical perspective to design the CollabCoder workflow, that integrates Large Language Models (LLMs) into key inductive CQA stages: independent open coding, iterative discussions, and final codebook creation. In the open coding phase, CollabCoder offers AI-generated code suggestions and records decision-making data. During discussions, it promotes mutual understanding by sharing this data within the coding team and using quantitative metrics to identify coding (dis)agreements, aiding in consensus-building. In the code grouping stage, CollabCoder provides primary code group suggestions, lightening the cognitive load of finalizing the codebook. A 16-user evaluation confirmed the effectiveness of CollabCoder, demonstrating its advantages over existing software and providing empirical insights into the role of LLMs in the CQA practice.
Submitted 22 January, 2024; v1 submitted 14 April, 2023;
originally announced April 2023.
-
Large-Scale Traffic Signal Control Using Constrained Network Partition and Adaptive Deep Reinforcement Learning
Authors:
Hankang Gu,
Shangbo Wang,
Xiaoguang Ma,
Dongyao Jia,
Guoqiang Mao,
Eng Gee Lim,
Cheuk Pong Ryan Wong
Abstract:
Multi-agent Deep Reinforcement Learning (MADRL) based traffic signal control has become a popular research topic in recent years. To alleviate the scalability issue of completely centralized RL techniques and the non-stationarity issue of completely decentralized RL techniques on large-scale traffic networks, some literature utilizes a regional control approach where the whole network is first partitioned into multiple disjoint regions, followed by applying the centralized RL approach to each region. However, the existing partitioning rules either have no constraints on the topology of regions or require the same topology for all regions. Meanwhile, no existing regional control approach explores the performance of optimal joint action in an exponentially growing regional action space when intersections are controlled by 4-phase traffic signals (EW, EWL, NS, NSL). In this paper, we propose a novel RL training framework named RegionLight to tackle the above limitations. Specifically, the topology of regions is first constrained to a star network which comprises one center and an arbitrary number of leaves. Next, the network partitioning problem is modeled as an optimization problem to minimize the number of regions. Then, an Adaptive Branching Dueling Q-Network (ABDQ) model is proposed to decompose the regional control task into several joint signal control sub-tasks corresponding to particular intersections. Subsequently, these sub-tasks maximize the regional benefits cooperatively. Finally, the global control strategy for the whole network is obtained by concatenating the optimal joint actions of all regions. Experimental results demonstrate the superiority of our proposed framework over all baselines under both real and synthetic datasets in all evaluation metrics.
Submitted 7 September, 2023; v1 submitted 21 March, 2023;
originally announced March 2023.
-
First results from the JWST Early Release Science Program Q3D: The Warm Ionized Gas Outflow in z ~ 1.6 Quasar XID 2028 and its Impact on the Host Galaxy
Authors:
Sylvain Veilleux,
Weizhe Liu,
Andrey Vayner,
Dominika Wylezalek,
David S. N. Rupke,
Nadia L. Zakamska,
Yuzo Ishikawa,
Caroline Bertemes,
Jorge K. Barrera-Ballesteros,
Hsiao-Wen Chen,
Nadiia Diachenko,
Andy D. Goulding,
Jenny E. Greene,
Kevin N. Hainline,
Fred Hamann,
Timothy Heckman,
Sean D. Johnson,
Hui Xian Grace Lim,
Dieter Lutz,
Nora Lutzgendorf,
Vincenzo Mainieri,
Roberto Maiolino,
Ryan McCrory,
Grey Murphree,
Nicole P. H. Nesvadba
, et al. (4 additional authors not shown)
Abstract:
Quasar feedback may regulate the growth of supermassive black holes, quench coeval star formation, and impact galaxy morphology and the circumgalactic medium. However, direct evidence for quasar feedback in action at the epoch of peak black hole accretion at z ~ 2 remains elusive. A good case in point is the z = 1.6 quasar WISEA J100211.29+013706.7 (XID 2028) where past analyses of the same ground-based data have come to different conclusions. Here we revisit this object with the integral field unit of the Near Infrared Spectrograph (NIRSpec) on board the James Webb Space Telescope (JWST) as part of Early Release Science program Q3D. The excellent angular resolution and sensitivity of the JWST data reveal new morphological and kinematic sub-structures in the outflowing gas plume. An analysis of the emission line ratios indicates that photoionization by the central quasar dominates the ionization state of the gas with no obvious sign for a major contribution from hot young stars anywhere in the host galaxy. Rest-frame near-ultraviolet emission aligned along the wide-angle cone of outflowing gas is interpreted as a scattering cone. The outflow has cleared a channel in the dusty host galaxy through which some of the quasar ionizing radiation is able to escape and heat the surrounding interstellar and circumgalactic media. The warm ionized outflow is not powerful enough to impact the host galaxy via mechanical feedback, but radiative feedback by the AGN, aided by the outflow, may help explain the unusually small molecular gas mass fraction in the galaxy host.
Submitted 22 June, 2023; v1 submitted 15 March, 2023;
originally announced March 2023.
-
VVS: Video-to-Video Retrieval with Irrelevant Frame Suppression
Authors:
Won Jo,
Geuntaek Lim,
Gwangjin Lee,
Hyunwoo Kim,
Byungsoo Ko,
Yukyung Choi
Abstract:
In content-based video retrieval (CBVR), dealing with large-scale collections, efficiency is as important as accuracy; thus, several video-level feature-based studies have actively been conducted. Nevertheless, owing to the severe difficulty of embedding a lengthy and untrimmed video into a single feature, these studies have been insufficient for accurate retrieval compared to frame-level feature-based studies. In this paper, we show that appropriate suppression of irrelevant frames can provide insight into the current obstacles of the video-level approaches. Furthermore, we propose a Video-to-Video Suppression network (VVS) as a solution. VVS is an end-to-end framework that consists of an easy distractor elimination stage to identify which frames to remove and a suppression weight generation stage to determine the extent to suppress the remaining frames. This structure is intended to effectively describe an untrimmed video with varying content and meaningless information. Its efficacy is proved via extensive experiments, and we show that our approach is not only state-of-the-art in video-level approaches but also has a fast inference time despite possessing retrieval capabilities close to those of frame-level approaches. Code is available at https://github.com/sejong-rcv/VVS
Submitted 19 December, 2023; v1 submitted 15 March, 2023;
originally announced March 2023.
-
First results from the JWST Early Release Science Program Q3D: Ionization cone, clumpy star formation and shocks in a $z=3$ extremely red quasar host
Authors:
Andrey Vayner,
Nadia L. Zakamska,
Yuzo Ishikawa,
Swetha Sankar,
Dominika Wylezalek,
David S. N. Rupke,
Sylvain Veilleux,
Caroline Bertemes,
Jorge K. Barrera-Ballesteros,
Hsiao-Wen Chen,
Nadiia Diachenko,
Andy D. Goulding,
Jenny E. Greene,
Kevin N. Hainline,
Fred Hamann,
Timothy Heckman,
Sean D. Johnson,
Hui Xian Grace Lim,
Weizhe Liu,
Dieter Lutz,
Nora Lutzgendorf,
Vincenzo Mainieri,
Ryan McCrory,
Grey Murphree,
Nicole P. H. Nesvadba
, et al. (3 additional authors not shown)
Abstract:
Massive galaxies formed most actively at redshifts $z=1-3$ during the period known as `cosmic noon.' Here we present an emission-line study of an extremely red quasar SDSSJ165202.64+172852.3 host galaxy at $z=2.94$, based on observations with the Near Infrared Spectrograph (NIRSpec) integral field unit (IFU) on board JWST. We use standard emission-line diagnostic ratios to map the sources of gas ionization across the host and a swarm of companion galaxies. The quasar dominates the photoionization, but we also discover shock-excited regions orthogonal to the ionization cone and the quasar-driven outflow. These shocks could be merger-induced or -- more likely, given the presence of a powerful galactic-scale quasar outflow -- these are signatures of wide-angle outflows that can reach parts of the galaxy that are not directly illuminated by the quasar. Finally, the kinematically narrow emission associated with the host galaxy presents as a collection of 1 kpc-scale clumps forming stars at a rate of at least 200 $M_{\odot}$ yr$^{-1}$. The ISM within these clumps shows high electron densities, reaching up to 3,000 cm$^{-3}$ with metallicities ranging from half to a third solar with a positive metallicity gradient and V band extinctions up to 3 magnitudes. The star formation conditions are far more extreme in these regions than in local star-forming galaxies but consistent with that of massive galaxies at cosmic noon. JWST observations reveal an archetypical rapidly forming massive galaxy undergoing a merger, a clumpy starburst, an episode of obscured near-Eddington quasar activity, and an extremely powerful quasar outflow simultaneously.
Submitted 25 July, 2023; v1 submitted 13 March, 2023;
originally announced March 2023.
-
The Early Light Curve of a Type Ia Supernova 2021hpr in NGC 3147: Progenitor Constraints with the Companion Interaction Model
Authors:
Gu Lim,
Myungshin Im,
Gregory S. H. Paek,
Sung-Chul Yoon,
Changsu Choi,
Sophia Kim,
J. Craig Wheeler,
Benjamin P. Thomas,
Jozsef Vinkó,
Dohyeong Kim,
Jinguk Seo,
Wonseok Kang,
Taewoo Kim,
Hyun-Il Sung,
Yonggi Kim,
Joh-Na Yoon,
Haeun Kim,
Jeongmook Kim,
Hana Bae,
Shuhrat Ehgamberdiev,
Otabek Burhonov,
Davron Mirzaqulov
Abstract:
The progenitor system of Type Ia supernovae (SNe Ia) is expected to be a close binary system of a carbon/oxygen white dwarf (WD) and a non-degenerate star or another WD. Here, we present results from a high-cadence monitoring observation of SN 2021hpr in a spiral galaxy, NGC 3147, and constraints on the progenitor system based on its early multi-color light curve data. First, we classify SN 2021hpr as a normal SN Ia from its long-term photometric and spectroscopic data. More interestingly, we found a significant "early excess" in the light curve over a simple power-law $\sim t^{2}$ evolution. The early light curve evolves from blue to red and blue during the first week. To explain this, we fitted the early part of $BVRI$-band light curves with a two-component model of the ejecta-companion interaction and a simple power-law model. The early excess and its color can be explained by shock cooling emission due to a companion star having a radius of $8.84\pm0.58$$R_{\odot}$. We also examined HST pre-explosion images with no detection of a progenitor candidate, consistent with the above result. However, we could not detect signs of a significant amount of the stripped mass from a non-degenerate companion star ($\lesssim0.003\,M_{\odot}$ for H$α$ emission). The early excess light in the multi-band light curve supports a non-degenerate companion in the progenitor system of SN 2021hpr. At the same time, the non-detection of emission lines opens a door for other methods to explain this event.
Submitted 9 March, 2023;
originally announced March 2023.
-
A combinatorial proof of the general identity of He-Nie-Yu
Authors:
Dong Gyu Lim
Abstract:
We give a uniform and combinatorial proof of the general identity appearing in the work of He-Nie-Yu on the affine Deligne-Lusztig varieties with finite Coxeter part.
Submitted 8 March, 2023; v1 submitted 26 February, 2023;
originally announced February 2023.
-
Nonemptiness of single affine Deligne-Lusztig varieties
Authors:
Dong Gyu Lim
Abstract:
Affine Deligne-Lusztig varieties with various level structures show up in the study of Shimura varieties and moduli spaces of shtukas. Among these is the Iwahori level structure, which is the most refined one. We study the nonemptiness problem of single affine Deligne-Lusztig varieties at Iwahori level in the basic case. Under a genericity condition (the ``shrunken Weyl chambers'' condition), an explicit criterion is known. However, no explicit criterion has been available without the condition, even conjecturally. We conjecture a new criterion in full generality, and prove it except for finitely many cases. As an application, the nonemptiness problem for special cases and a new conjectural dimension formula are discussed.
Submitted 7 March, 2023; v1 submitted 9 February, 2023;
originally announced February 2023.
-
Bounds on Cheeger-Gromov invariants and simplicial complexity of triangulated manifolds
Authors:
Geunho Lim,
Shmuel Weinberger
Abstract:
We show the existence of linear bounds on Wall $ρ$-invariants of PL manifolds, employing a new combinatorial concept of $G$-colored polyhedra. As an application, we show how the number of h-cobordism classes of manifolds simple homotopy equivalent to a lens space with $V$ simplices and the fundamental group of $\mathbb{Z}_n$ grows in $V$. Furthermore, we count the number of homotopy lens spaces with bounded geometry in $V$. Similarly, we also give new linear bounds on Cheeger-Gromov $ρ$-invariants of PL manifolds endowed with a faithful representation. A key idea is to construct a cobordism with a linear complexity whose boundary is $π_1$-injectively embedded, using relative hyperbolization. As an application, we study the complexity theory of high-dimensional lens spaces. Lastly, we show the density of $ρ$-invariants over manifolds homotopy equivalent to a given manifold for certain fundamental groups. This implies that the structure set is not finitely generated.
Submitted 19 January, 2024; v1 submitted 20 January, 2023;
originally announced January 2023.
-
Diagnosis of COVID-19 based on Chest Radiography
Authors:
Mei Gah Lim,
Hoi Leong Lee
Abstract:
The Coronavirus disease 2019 (COVID-19) was first identified in Wuhan, China, in early December 2019 and has since become a pandemic. When COVID-19 patients undergo radiography examination, radiologists can observe the presence of radiographic abnormalities in their chest X-ray (CXR) images. In this study, a deep convolutional neural network (CNN) model was proposed to aid radiologists in diagnosing COVID-19 patients. First, this work conducted a comparative study on the performance of modified VGG-16, ResNet-50 and DenseNet-121 to classify CXR images into normal, COVID-19 and viral pneumonia. Then, the impact of image augmentation on the classification results was evaluated. The publicly available COVID-19 Radiography Database was used throughout this study. After comparison, ResNet-50 achieved the highest accuracy with 95.88%. Next, after training ResNet-50 with a dataset augmented by rotation, translation, horizontal flip, intensity shift and zoom, the accuracy dropped to 80.95%. Furthermore, an ablation study on the effect of image augmentation on the classification results found that the combination of rotation and intensity shift augmentations obtained an accuracy of 96.14%, higher than the baseline. Finally, ResNet-50 with rotation and intensity shift augmentations performed the best and was proposed as the final classification model in this work. These findings demonstrated that the proposed classification model can provide a promising result for COVID-19 diagnosis.
Submitted 26 December, 2022;
originally announced December 2022.
-
Fully and Weakly Supervised Referring Expression Segmentation with End-to-End Learning
Authors:
Hui Li,
Mingjie Sun,
Jimin Xiao,
Eng Gee Lim,
Yao Zhao
Abstract:
Referring Expression Segmentation (RES), which is aimed at localizing and segmenting the target according to the given language expression, has drawn increasing attention. Existing methods jointly consider the localization and segmentation steps, which rely on the fused visual and linguistic features for both steps. We argue that the conflict between the purpose of identifying an object and generating a mask limits the RES performance. To solve this problem, we propose a parallel position-kernel-segmentation pipeline to better isolate and then interact the localization and segmentation steps. In our pipeline, linguistic information will not directly contaminate the visual feature for segmentation. Specifically, the localization step localizes the target object in the image based on the referring expression, and then the visual kernel obtained from the localization step guides the segmentation step. This pipeline also enables us to train RES in a weakly-supervised way, where the pixel-level segmentation labels are replaced by click annotations on center and corner points. The position head is fully-supervised and trained with the click annotations as supervision, and the segmentation head is trained with weakly-supervised segmentation losses. To validate our framework on a weakly-supervised setting, we annotated three RES benchmark datasets (RefCOCO, RefCOCO+ and RefCOCOg) with click annotations. Our method is simple but surprisingly effective, outperforming all previous state-of-the-art RES methods on fully- and weakly-supervised settings by a large margin. The benchmark code and datasets will be released.
Submitted 17 December, 2022;
originally announced December 2022.
-
Evaluating Out-of-Distribution Performance on Document Image Classifiers
Authors:
Stefan Larson,
Gordon Lim,
Yutong Ai,
David Kuang,
Kevin Leach
Abstract:
The ability of a document classifier to handle inputs that are drawn from a distribution different from the training distribution is crucial for robust deployment and generalizability. The RVL-CDIP corpus is the de facto standard benchmark for document classification, yet to our knowledge all studies that use this corpus do not include evaluation on out-of-distribution documents. In this paper, we curate and release a new out-of-distribution benchmark for evaluating out-of-distribution performance for document classifiers. Our new out-of-distribution benchmark consists of two types of documents: those that are not part of any of the 16 in-domain RVL-CDIP categories (RVL-CDIP-O), and those that are one of the 16 in-domain categories yet are drawn from a distribution different from that of the original RVL-CDIP dataset (RVL-CDIP-N). While prior work on document classification for in-domain RVL-CDIP documents reports high accuracy scores, we find that these models exhibit accuracy drops of between roughly 15-30% on our new out-of-domain RVL-CDIP-N benchmark, and further struggle to distinguish between in-domain RVL-CDIP-N and out-of-domain RVL-CDIP-O inputs. Our new benchmark provides researchers with a valuable new resource for analyzing out-of-distribution performance on document classifiers. Our new out-of-distribution data can be found at https://github.com/gxlarson/rvl-cdip-ood.
Submitted 18 January, 2023; v1 submitted 13 October, 2022;
originally announced October 2022.
-
Search for the Pair Production of Dark Particles $X$ with $K_L^0 \to XX$, $X \to γγ$
Authors:
C. Lin,
J. K. Ahn,
J. M. Choi,
M. S. Farrington,
M. Gonzalez,
N. Grethen,
Y. B. Hsiung,
T. Inagaki,
I. Kamiji,
E. J. Kim,
J. L. Kim,
H. M. Kim,
K. Kawata,
A. Kitagawa,
T. K. Komatsubara,
K. Kotera,
S. K. Lee,
J. W. Lee,
G. Y. Lim,
Y. Luo,
T. Matsumura,
K. Nakagiri,
H. Nanjo,
T. Nomura,
K. Ono
, et al. (17 additional authors not shown)
Abstract:
We present the first search for the pair production of dark particles $X$ via $K_L^0\to XX$ with $X$ decaying into two photons using the data collected by the KOTO experiment. No signal was observed in the mass range of 40 - 110 MeV/c$^2$ and 210 - 240 MeV/c$^2$. This sets upper limits on the branching fractions as $\mathcal{B}(K_L^0 \to XX)$ $<$ (1-4) $\times$ 10$^{-7}$ and $\mathcal{B}(K_L^0 \to XX)$ $<$ (1-2) $\times$ 10$^{-6}$ at the 90% confidence level for the two mass regions, respectively.
Submitted 6 February, 2023; v1 submitted 22 September, 2022;
originally announced September 2022.
-
Simulation of angular resolution of a new electromagnetic sampling calorimeter
Authors:
Junlee Kim,
Eun-Joo Kim,
YoungJun Kim,
JungKeun Ahn,
GeiYoub Lim
Abstract:
We report on the simulation results for the angular resolution of an electromagnetic (EM) sampling calorimeter with photons in the range of 100~MeV to 2~GeV. The simulation model of the EM calorimeter consists of alternating layers of a 1-mm-thick lead plate and a 5-mm-thick plastic scintillator plate. The scintillator plates are alternately segmented into horizontal and vertical strips. In this study, we obtain energy deposits in individual strips using Geant4 simulations and reconstruct the incident photon angles using XGBoost with gradient-boosted decision trees. The performance of the angle reconstruction depends on the detector configuration and the accuracy of machine learning. The angular resolution is well described by the expression $0.24^{\circ} \oplus 1.25^{\circ}/\sqrt{E_γ}$, where $E_γ$ is the incident photon energy in GeV, for strips of 15 mm and 32 layers. This energy dependence is consistent for different incident angles in the range of 10$^{\circ}$ to 40$^{\circ}$.
Submitted 1 February, 2023; v1 submitted 17 August, 2022;
originally announced August 2022.
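The resolution expression quoted in the calorimeter abstract above, $0.24^{\circ} \oplus 1.25^{\circ}/\sqrt{E_γ}$, can be evaluated numerically. The sketch below is only an illustration, not the authors' code: it assumes that $\oplus$ denotes the usual quadrature sum of a constant term and a stochastic term, and the function name `angular_resolution_deg` and its parameter names are hypothetical.

```python
import math


def angular_resolution_deg(energy_gev, constant_deg=0.24, stochastic_deg=1.25):
    """Angular resolution in degrees for a photon energy given in GeV.

    Assumes the circled-plus in the abstract means addition in quadrature:
    sigma = sqrt(constant^2 + (stochastic / sqrt(E))^2).
    The default coefficients are the ones quoted for 15 mm strips and 32 layers.
    """
    if energy_gev <= 0:
        raise ValueError("photon energy must be positive")
    # math.hypot(a, b) returns sqrt(a*a + b*b), i.e. the quadrature sum.
    return math.hypot(constant_deg, stochastic_deg / math.sqrt(energy_gev))


if __name__ == "__main__":
    # Energies spanning the 100 MeV to 2 GeV range studied in the abstract.
    for energy in (0.1, 0.5, 1.0, 2.0):
        print(f"{energy:4.1f} GeV -> {angular_resolution_deg(energy):.2f} deg")
```

Under this reading, the stochastic term dominates at low energy, and the resolution approaches the constant term as the photon energy grows.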
-
The connected components of affine Deligne--Lusztig varieties
Authors:
Ian Gleason,
Dong Gyu Lim,
Yujie Xu
Abstract:
We compute the connected components of arbitrary parahoric level affine Deligne--Lusztig varieties and local Shimura varieties, thus resolving the conjecture raised in \cite{He} in full generality (even for non-quasisplit groups). We achieve this by relating them to the connected components of infinite level moduli spaces of $p$-adic shtukas, where we use v-sheaf-theoretic techniques such as the specialization map of \textit{kimberlites}. Along the way, we give a $p$-adic Hodge-theoretic characterization of HN-irreducibility.
As applications, we obtain many results on the geometry of integral models of Shimura varieties at arbitrary parahoric levels. In particular, we deduce new CM lifting results on integral models of Shimura varieties for quasi-split groups at arbitrary connected parahoric levels.
Submitted 8 January, 2023; v1 submitted 15 August, 2022;
originally announced August 2022.