Search | arXiv e-print repository

Superpoint Gaussian Splatting for Real-Time High-Fidelity Dynamic Scene Reconstruction

Authors: Diwen Wan, Ruijie Lu, Gang Zeng

Abstract: Rendering novel view images in dynamic scenes is a crucial yet challenging task. Current methods mainly utilize NeRF-based methods to represent the static scene and an additional time-variant MLP to model scene deformations, resulting in relatively low rendering quality as well as slow inference speed. To tackle these challenges, we propose a novel framework named Superpoint Gaussian Splatting (SP-GS). Specifically, our framework first employs explicit 3D Gaussians to reconstruct the scene and then clusters Gaussians with similar properties (e.g., rotation, translation, and location) into superpoints. Empowered by these superpoints, our method manages to extend 3D Gaussian splatting to dynamic scenes with only a slight increase in computational expense. Apart from achieving state-of-the-art visual quality and real-time rendering under high resolutions, the superpoint representation provides a stronger manipulation capability. Extensive experiments demonstrate the practicality and effectiveness of our approach on both synthetic and real-world datasets. Please see our project page at https://dnvtmf.github.io/SP_GS.github.io.

Submitted 5 June, 2024; originally announced June 2024.

Comments: Accepted by ICML 2024

arXiv:2403.11081 [pdf, other]

Enhanced Index Modulation Aided Non-Orthogonal Multiple Access via Constellation Rotation

Authors: Ronglan Huang, Fei ji, Zeng Hu, Dehuan Wan, Pengcheng Xu, Yun Liu

Abstract: Non-orthogonal multiple access (NOMA) has been widely nominated as an emerging spectral efficiency (SE) multiple access technique for the next generation of wireless communication networks. To meet the growing demands in massive connectivity and huge data in transmission, a novel index modulation aided NOMA with the rotation of signal constellation of low power users (IM-NOMA-RC) is developed for downlink transmission. In the proposed IM-NOMA-RC system, the users are classified into a far-user group and a near-user group according to their channel conditions, where the rotation constellation based IM operation is performed only on the users who belong to the near-user group, which are allocated lower power than the far ones, to transmit extra information. In the proposed IM-NOMA-RC, all the subcarriers are activated to transmit information to multiple users to achieve higher SE. With the aid of the multiple dimension modulation in IM-NOMA-RC, more users can be supported over an orthogonal resource block. Then, both the maximum likelihood (ML) detector and the successive interference cancellation (SIC) detector are studied for all users. Numerical simulation results of the proposed IM-NOMA-RC scheme are investigated for the ML detector and the SIC detector for each user, showing that the proposed scheme can outperform conventional NOMA.

Submitted 17 March, 2024; originally announced March 2024.

arXiv:2403.02325 [pdf, other]

Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training

Authors: David Wan, Jaemin Cho, Elias Stengel-Eskin, Mohit Bansal

Abstract: Highlighting particularly relevant regions of an image can improve the performance of vision-language models (VLMs) on various vision-language (VL) tasks by guiding the model to attend more closely to these regions of interest. For example, VLMs can be given a "visual prompt", where visual markers such as bounding boxes delineate key image regions. However, current VLMs that can incorporate visual guidance are either proprietary and expensive or require costly training on curated data that includes visual prompts. We introduce Contrastive Region Guidance (CRG), a training-free guidance method that enables open-source VLMs to respond to visual prompts. CRG contrasts model outputs produced with and without visual prompts, factoring out biases revealed by the model when answering without the information required to produce a correct answer (i.e., the model's prior). CRG achieves substantial improvements in a wide variety of VL tasks: When region annotations are provided, CRG increases absolute accuracy by up to 11.1% on ViP-Bench, a collection of six diverse region-based tasks such as recognition, math, and object relationship reasoning. We also show CRG's applicability to spatial reasoning, with 10% improvement on What'sUp, as well as to compositional generalization -- improving accuracy by 11.5% and 7.5% on two challenging splits from SugarCrepe -- and to image-text alignment for generated images, where we improve by up to 8.4 AUROC and 6.8 F1 points on SeeTRUE.
When reference regions are absent, CRG allows us to re-rank proposed regions in referring expression comprehension and phrase grounding benchmarks like RefCOCO/+/g and Flickr30K Entities, with an average gain of 3.2% in accuracy. Our analysis explores alternative masking strategies for CRG, quantifies CRG's probability shift, and evaluates the role of region guidance strength, empirically validating CRG's design choices.

Submitted 4 March, 2024; originally announced March 2024.

Comments: Project website: https://contrastive-region-guidance.github.io/
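
Illustrative sketch (not from the paper): the abstract above says CRG contrasts model outputs produced with and without the visual prompt, but it does not give the formula. The snippet below shows one common way to contrast two sets of logits, in the style of classifier-free guidance. The function names, the `alpha` guidance-strength parameter, and the exact weighting are assumptions made for illustration only.

```python
import math


def contrast_logits(with_region, without_region, alpha=1.0):
    """Combine two logit vectors so that evidence tied to the region is amplified.

    Hypothetical formula: (1 + alpha) * with - alpha * without.
    With alpha = 0 this returns the `with_region` logits unchanged.
    """
    if len(with_region) != len(without_region):
        raise ValueError("logit vectors must have the same length")
    return [(1 + alpha) * a - alpha * b for a, b in zip(with_region, without_region)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy logits for two candidate answers (made-up numbers).
with_region = [2.0, 1.0]     # model sees the highlighted region
without_region = [2.0, 0.0]  # model answers from its prior only
probs = softmax(contrast_logits(with_region, without_region, alpha=1.0))
```

In this toy case the first answer scores equally high with or without the region, so the contrast treats that score as prior bias, and the second answer, whose score rises only when the region is visible, gains relative probability.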

arXiv:2401.16355 [pdf, other]

PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology

Authors: Yuxuan Sun, Hao Wu, Chenglu Zhu, Sunyi Zheng, Qizi Chen, Kai Zhang, Yunlong Zhang, Dan Wan, Xiaoxiao Lan, Mengyue Zheng, Jingxiong Li, Xinheng Lyu, Tao Lin, Lin Yang

Abstract: The emergence of large multimodal models has unlocked remarkable potential in AI, particularly in pathology. However, the lack of specialized, high-quality benchmarks has impeded their development and precise evaluation. To address this, we introduce PathMMU, the largest and highest-quality expert-validated pathology benchmark for Large Multimodal Models (LMMs). It comprises 33,428 multimodal multi-choice questions and 24,067 images from various sources, each accompanied by an explanation for the correct answer. The construction of PathMMU harnesses GPT-4V's advanced capabilities, utilizing over 30,000 image-caption pairs to enrich captions and generate corresponding Q&As in a cascading process. Significantly, to maximize PathMMU's authority, we invite seven pathologists to scrutinize each question under strict standards in PathMMU's validation and test sets, while simultaneously setting an expert-level performance benchmark for PathMMU. We conduct extensive evaluations, including zero-shot assessments of 14 open-sourced and 4 closed-sourced LMMs and their robustness to image corruption. We also fine-tune representative LMMs to assess their adaptability to PathMMU. The empirical findings indicate that advanced LMMs struggle with the challenging PathMMU benchmark, with the top-performing LMM, GPT-4V, achieving only a 49.8% zero-shot performance, significantly lower than the 71.8% demonstrated by human pathologists. After fine-tuning, significantly smaller open-sourced LMMs can outperform GPT-4V but still fall short of the expertise shown by pathologists.
We hope that PathMMU will offer valuable insights and foster the development of more specialized, next-generation LMMs for pathology.

Submitted 20 March, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: 27 pages, 12 figures

arXiv:2311.16832 [pdf, other]

CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models

Authors: Jinfeng Zhou, Zhuang Chen, Dazhen Wan, Bosi Wen, Yi Song, Jifan Yu, Yongkang Huang, Libiao Peng, Jiaming Yang, Xiyao Xiao, Sahand Sabour, Xiaohan Zhang, Wenjing Hou, Yijia Zhang, Yuxiao Dong, Jie Tang, Minlie Huang

Abstract: In this paper, we present CharacterGLM, a series of models built upon ChatGLM, with model sizes ranging from 6B to 66B parameters. Our CharacterGLM is designed for generating Character-based Dialogues (CharacterDial), which aims to equip a conversational AI system with character customization for satisfying people's inherent social desires and emotional needs. On top of CharacterGLM, we can customize various AI characters or social agents by configuring their attributes (identities, interests, viewpoints, experiences, achievements, social relationships, etc.) and behaviors (linguistic features, emotional expressions, interaction patterns, etc.). Our model outperforms most mainstream closed-source large language models, including the GPT series, especially in terms of consistency, human-likeness, and engagement according to manual evaluations. We will release our 6B version of CharacterGLM and a subset of training data to facilitate further research development in the direction of character-based dialogue generation.

Submitted 28 November, 2023; originally announced November 2023.

Comments: Work in progress

arXiv:2305.16233 [pdf, other]

Interactive Segment Anything NeRF with Feature Imitation

Authors: Xiaokang Chen, Jiaxiang Tang, Diwen Wan, Jingbo Wang, Gang Zeng

Abstract: This paper investigates the potential of enhancing Neural Radiance Fields (NeRF) with semantics to expand their applications. Although NeRF has been proven useful in real-world applications like VR and digital creation, the lack of semantics hinders interaction with objects in complex scenes. We propose to imitate the backbone feature of off-the-shelf perception models to achieve zero-shot semantic segmentation with NeRF. Our framework reformulates the segmentation process by directly rendering semantic features and only applying the decoder from perception models. This eliminates the need for expensive backbones and benefits 3D consistency. Furthermore, we can project the learned semantics onto extracted mesh surfaces for real-time interaction. With the state-of-the-art Segment Anything Model (SAM), our framework accelerates segmentation by 16 times with comparable mask quality. The experimental results demonstrate the efficacy and computational advantages of our approach. Project page: https://me.kiui.moe/san/.

Submitted 25 May, 2023; originally announced May 2023.

Comments: Technical Report

arXiv:2305.04782 [pdf, other]

HistAlign: Improving Context Dependency in Language Generation by Aligning with History

Authors: David Wan, Shiyue Zhang, Mohit Bansal

Abstract: Language models (LMs) can generate hallucinations and incoherent outputs, which highlights their weak context dependency. Cache-LMs, which augment LMs with a memory of recent history, can increase context dependency and have shown remarkable performance in diverse language generation tasks. However, we find that even with training, the performance gain stemming from the cache component of current cache-LMs is suboptimal due to the misalignment between the current hidden states and those stored in the memory. In this work, we present HistAlign, a new training approach to ensure good cache alignment such that the model receives useful signals from the history. We first prove our concept on a simple and synthetic task where the memory is essential for correct predictions, and we show that the cache component of HistAlign is better aligned and improves overall performance. Next, we evaluate HistAlign on diverse downstream language generation tasks, including prompt continuation, abstractive summarization, and data-to-text. We demonstrate that HistAlign improves text coherence and faithfulness in open-ended and conditional generation settings respectively. HistAlign is also generalizable across different model families, showcasing its strength in improving context dependency of LMs in diverse scenarios. Our code is publicly available at https://github.com/meetdavidwan/histalign

Submitted 3 December, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

Comments: EMNLP 2023 (20 pages)

arXiv:2303.03278 [pdf, other]

Faithfulness-Aware Decoding Strategies for Abstractive Summarization

Authors: David Wan, Mengwen Liu, Kathleen McKeown, Markus Dreyer, Mohit Bansal

Abstract: Despite significant progress in understanding and improving faithfulness in abstractive summarization, the question of how decoding strategies affect faithfulness is less studied. We present a systematic study of the effect of generation techniques such as beam search and nucleus sampling on faithfulness in abstractive summarization. We find a consistent trend where beam search with large beam sizes produces the most faithful summaries while nucleus sampling generates the least faithful ones. We propose two faithfulness-aware generation methods to further improve faithfulness over current generation techniques: (1) ranking candidates generated by beam search using automatic faithfulness metrics and (2) incorporating lookahead heuristics that produce a faithfulness score on the future summary. We show that both generation methods significantly improve faithfulness across two datasets as evaluated by four automatic faithfulness metrics and human evaluation. To reduce computational cost, we demonstrate a simple distillation approach that allows the model to generate faithful summaries with just greedy decoding. Our code is publicly available at https://github.com/amazon-science/faithful-summarization-generation

Submitted 6 March, 2023; originally announced March 2023.

Comments: EACL 2023 (17 pages)

arXiv:2211.17148 [pdf, other]

ConvLab-3: A Flexible Dialogue System Toolkit Based on a Unified Data Format

Authors: Qi Zhu, Christian Geishauser, Hsien-chin Lin, Carel van Niekerk, Baolin Peng, Zheng Zhang, Michael Heck, Nurul Lubis, Dazhen Wan, Xiaochen Zhu, Jianfeng Gao, Milica Gašić, Minlie Huang

Abstract: Task-oriented dialogue (TOD) systems function as digital assistants, guiding users through various tasks such as booking flights or finding restaurants. Existing toolkits for building TOD systems often fall short in delivering comprehensive arrays of data, models, and experimental environments with a user-friendly experience. We introduce ConvLab-3: a multifaceted dialogue system toolkit crafted to bridge this gap. Our unified data format simplifies the integration of diverse datasets and models, significantly reducing complexity and cost for studying generalization and transfer. Enhanced with robust reinforcement learning (RL) tools, featuring a streamlined training process, in-depth evaluation tools, and a selection of user simulators, ConvLab-3 supports the rapid development and evaluation of robust dialogue policies. Through an extensive study, we demonstrate the efficacy of transfer learning and RL and showcase that ConvLab-3 is not only a powerful tool for seasoned researchers but also an accessible platform for newcomers.

Submitted 17 October, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

arXiv:2211.02580 [pdf, other]

Evaluating and Improving Factuality in Multimodal Abstractive Summarization

Authors: David Wan, Mohit Bansal

Abstract: Current metrics for evaluating factuality for abstractive document summarization have achieved high correlations with human judgment, but they do not account for the vision modality and thus are not adequate for vision-and-language summarization. We propose CLIPBERTScore, a simple weighted combination of CLIPScore and BERTScore to leverage the robustness and strong factuality detection performance between image-summary and document-summary, respectively. Next, due to the lack of meta-evaluation benchmarks to evaluate the quality of multimodal factuality metrics, we collect human judgments of factuality with respect to documents and images. We show that this simple combination of two metrics in the zero-shot setting achieves higher correlations than existing factuality metrics for document summarization, outperforms an existing multimodal summarization metric, and performs competitively with strong multimodal factuality metrics specifically fine-tuned for the task. Our thorough analysis demonstrates the robustness and high correlation of CLIPBERTScore and its components on four factuality metric-evaluation benchmarks. Finally, we demonstrate two practical downstream applications of our CLIPBERTScore metric: for selecting important images to focus on during training, and as a reward for reinforcement learning to improve factuality of multimodal summary generation w.r.t. automatic and human evaluation. Our data and code are publicly available at https://github.com/meetdavidwan/faithful-multimodal-summ

Submitted 4 November, 2022; originally announced November 2022.

Comments: EMNLP 2022 (17 pages)
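
Illustrative sketch (not from the paper): the abstract above describes CLIPBERTScore as a simple weighted combination of CLIPScore (image-summary) and BERTScore (document-summary). A minimal way to express such a combination is shown below, assuming a convex weighting. The function name, the `alpha` weight and its default value, and the input scores are hypothetical; the abstract does not specify them.

```python
def combined_factuality_score(clip_score: float, bert_score: float, alpha: float = 0.25) -> float:
    """Weighted combination of an image-summary score and a document-summary score.

    `alpha` is a hypothetical weight on the image-summary score; the remaining
    weight (1 - alpha) goes to the document-summary score.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * clip_score + (1.0 - alpha) * bert_score


# Made-up component scores for a single summary.
score = combined_factuality_score(clip_score=0.8, bert_score=0.4, alpha=0.5)
```

Setting `alpha` to 0 or 1 recovers either component on its own, which makes it easy to compare the combination against each metric separately.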

arXiv:2209.03549 [pdf, other]

Extractive is not Faithful: An Investigation of Broad Unfaithfulness Problems in Extractive Summarization

Authors: Shiyue Zhang, David Wan, Mohit Bansal

Abstract: The problems of unfaithful summaries have been widely discussed under the context of abstractive summarization. Though extractive summarization is less prone to the common unfaithfulness issues of abstractive summaries, does that mean extractive is equal to faithful? Turns out that the answer is no. In this work, we define a typology with five types of broad unfaithfulness problems (including and beyond not-entailment) that can appear in extractive summaries, including incorrect coreference, incomplete coreference, incorrect discourse, incomplete discourse, as well as other misleading information. We ask humans to label these problems out of 1600 English summaries produced by 16 diverse extractive systems. We find that 30% of the summaries have at least one of the five issues. To automatically detect these problems, we find that 5 existing faithfulness evaluation metrics for summarization have poor correlations with human judgment. To remedy this, we propose a new metric, ExtEval, that is designed for detecting unfaithful extractive summaries and is shown to have the best performance. We hope our work can increase the awareness of unfaithfulness problems in extractive summarization and help future work to evaluate and resolve these issues. Our data and code are publicly available at https://github.com/ZhangShiyue/extractive_is_not_faithful

Submitted 29 May, 2023; v1 submitted 7 September, 2022; originally announced September 2022.

Comments: ACL 2023 (20 pages; the first 2 authors contributed equally)

arXiv:2207.12584 [pdf, ps, other]

On Deep Holes of Elliptic Curve Codes

Authors: Jun Zhang, Daqing Wan

Abstract: We give a method to construct deep holes for elliptic curve codes. For long elliptic curve codes, we conjecture that our construction is complete in the sense that it gives all deep holes. Some evidence and heuristics on the completeness are provided via the connection with problems and results in finite geometry.

Submitted 25 July, 2022; originally announced July 2022.

Comments: 19 pages

arXiv:2205.07830 [pdf, other]

FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization

Authors: David Wan, Mohit Bansal

Abstract: We present FactPEGASUS, an abstractive summarization model that addresses the problem of factuality during pre-training and fine-tuning: (1) We augment the sentence selection strategy of PEGASUS's (Zhang et al., 2020) pre-training objective to create pseudo-summaries that are both important and factual; (2) We introduce three complementary components for fine-tuning. The corrector removes hallucinations present in the reference summary, the contrastor uses contrastive learning to better differentiate nonfactual summaries from factual ones, and the connector bridges the gap between the pre-training and fine-tuning for better transfer of knowledge. Experiments on three downstream tasks demonstrate that FactPEGASUS substantially improves factuality evaluated by multiple automatic metrics and humans. Our thorough analysis suggests that FactPEGASUS is more factual than using the original pre-training objective in zero-shot and few-shot settings, retains factual behavior more robustly than strong baselines, and does not rely entirely on becoming more extractive to improve factuality. Our code and data are publicly available at: https://github.com/meetdavidwan/factpegasus

Submitted 16 May, 2022; originally announced May 2022.

Comments: NAACL 2022 (19 pages)

arXiv:2204.07725 [pdf, other]

doi 10.1016/j.pmcj.2021.101434

Is Blockchain for Internet of Medical Things a Panacea for COVID-19 Pandemic?

Authors: Xuran Li, Bishenghui Tao, Hong-Ning Dai, Muhammad Imran, Dehuan Wan, Dengwang Li

Abstract: The outbreak of the COVID-19 pandemic has deeply influenced the lifestyle of the general public and the healthcare system of the society. As a promising approach to address the emerging challenges caused by the epidemic of infectious diseases like COVID-19, Internet of Medical Things (IoMT) deployed in hospitals, clinics, and healthcare centers can save diagnosis time and improve the efficiency of medical resources, though privacy and security concerns of IoMT stall its wide adoption. In order to tackle the privacy, security, and interoperability issues of IoMT, we propose a framework of blockchain-enabled IoMT by introducing blockchain to incumbent IoMT systems. In this paper, we review the benefits of this architecture and illustrate the opportunities brought by blockchain-enabled IoMT. We also provide use cases of blockchain-enabled IoMT on fighting against the COVID-19 pandemic, including the prevention of infectious diseases, location sharing and contact tracing, and the supply chain of injectable medicines. We also outline future work in this area.

Submitted 16 April, 2022; originally announced April 2022.

Comments: 15 pages, 8 figures

ACM Class: J.3; C.2.4

Journal ref: Pervasive and Mobile Computing, 2021

arXiv:2106.07708 [pdf]

CathAI: Fully Automated Interpretation of Coronary Angiograms Using Neural Networks

Authors: Robert Avram, Jeffrey E. Olgin, Alvin Wan, Zeeshan Ahmed, Louis Verreault-Julien, Sean Abreau, Derek Wan, Joseph E. Gonzalez, Derek Y. So, Krishan Soni, Geoffrey H. Tison

Abstract: Coronary heart disease (CHD) is the leading cause of adult death in the United States and worldwide, and for which the coronary angiography procedure is the primary gateway for diagnosis and clinical management decisions. The standard-of-care for interpretation of coronary angiograms depends upon ad-hoc visual assessment by the physician operator. However, ad-hoc visual interpretation of angiograms is poorly reproducible, highly variable and bias prone. Here we show for the first time that fully-automated angiogram interpretation to estimate coronary artery stenosis is possible using a sequence of deep neural network algorithms. The algorithmic pipeline we developed--called CathAI--achieves state-of-the-art performance across the sequence of tasks required to accomplish automated interpretation of unselected, real-world angiograms. CathAI (Algorithms 1-2) demonstrated positive predictive value, sensitivity and F1 score of >=90% to identify the projection angle overall and >=93% for left or right coronary artery angiogram detection, the primary anatomic structures of interest. To predict obstructive coronary artery stenosis (>=70% stenosis), CathAI (Algorithm 4) exhibited an area under the receiver operating characteristic curve (AUC) of 0.862 (95% CI: 0.843-0.880). When externally validated in a healthcare system in another country, CathAI AUC was 0.869 (95% CI: 0.830-0.907) to predict obstructive coronary artery stenosis.
Our results demonstrate that multiple purpose-built neural networks can function in sequence to accomplish the complex series of tasks required for automated analysis of real-world angiograms. Deployment of CathAI may serve to increase standardization and reproducibility in coronary stenosis assessment, while providing a robust foundation to accomplish future tasks for algorithmic angiographic interpretation.

Submitted 14 June, 2021; originally announced June 2021.

Comments: 62 pages, 3 main figures, 2 main tables

ACM Class: I.4.9; I.2.10; J.3

arXiv:2104.07868 [pdf, other]

Segmenting Subtitles for Correcting ASR Segmentation Errors

Authors: David Wan, Chris Kedzie, Faisal Ladhak, Elsbeth Turcan, Petra Galuščáková, Elena Zotkina, Zhengping Jiang, Peter Bell, Kathleen McKeown

Abstract: Typical ASR systems segment the input audio into utterances using purely acoustic information, which may not resemble the sentence-like units that are expected by conventional machine translation (MT) systems for Spoken Language Translation. In this work, we propose a model for correcting the acoustic segmentation of ASR models for low-resource languages to improve performance on downstream tasks. We propose the use of subtitles as a proxy dataset for correcting ASR acoustic segmentation, creating synthetic acoustic utterances by modeling common error modes. We train a neural tagging model for correcting ASR acoustic segmentation and show that it improves downstream performance on MT and audio-document cross-language information retrieval (CLIR).

Submitted 15 April, 2021; originally announced April 2021.

arXiv:2102.00617 [pdf]

The Controllability of Planning, Responsibility, and Security in Automatic Driving Technology

Authors: Dan Wan, Hao Zhan

Abstract: People hope automated driving technology is always in a stable and controllable state; specifically, it can be divided into controllable planning, controllable responsibility, and controllable information. When this controllability is undermined, it brings about problems such as the trolley dilemma, responsibility attribution, information leakage, and security. This article discusses these three types of issues separately and clarifies the misunderstandings.

Submitted 31 January, 2021; originally announced February 2021.

Comments: 49th International Conference on Computers and Industrial Engineering, CIE 2019. arXiv admin note: substantial text overlap with arXiv:1906.07861

arXiv:2012.15262 [pdf, other]

Robustness Testing of Language Understanding in Task-Oriented Dialog

Authors: Jiexi Liu, Ryuichi Takanobu, Jiaxin Wen, Dazhen Wan, Hongguang Li, Weiran Nie, Cheng Li, Wei Peng, Minlie Huang

Abstract: Most language understanding models in task-oriented dialog systems are trained on a small amount of annotated training data, and evaluated in a small set from the same distribution. However, these models can lead to system failure or undesirable output when being exposed to natural language perturbation or variation in practice. In this paper, we conduct comprehensive evaluation and analysis with respect to the robustness of natural language understanding models, and introduce three important aspects related to language understanding in real-world dialog systems, namely, language variety, speech characteristics, and noise perturbation. We propose a model-agnostic toolkit LAUG to approximate natural language perturbations for testing the robustness issues in task-oriented dialog. Four data augmentation approaches covering the three aspects are assembled in LAUG, which reveals critical robustness issues in state-of-the-art models. The augmented dataset through LAUG can be used to facilitate future research on the robustness testing of language understanding in task-oriented dialog.

Submitted 4 June, 2021; v1 submitted 30 December, 2020; originally announced December 2020.

Comments: ACL 2021 long paper

arXiv:2010.09693 [pdf, other]

Subtitles to Segmentation: Improving Low-Resource Speech-to-Text Translation Pipelines

Authors: David Wan, Zhengping Jiang, Chris Kedzie, Elsbeth Turcan, Peter Bell, Kathleen McKeown

Abstract: In this work, we focus on improving ASR output segmentation in the context of low-resource language speech-to-text translation. ASR output segmentation is crucial, as ASR systems segment the input audio using purely acoustic information and are not guaranteed to output sentence-like segments. Since most MT systems expect sentences as input, feeding in longer unsegmented passages can lead to sub-optimal performance. We explore the feasibility of using datasets of subtitles from TV shows and movies to train better ASR segmentation models. We further incorporate part-of-speech (POS) tag and dependency label information (derived from the unsegmented ASR outputs) into our segmentation model. We show that this noisy syntactic information can improve model accuracy. We evaluate our models intrinsically on segmentation quality and extrinsically on downstream MT performance, as well as downstream tasks including cross-lingual information retrieval (CLIR) tasks and human relevance assessments. Our model shows improved performance on downstream tasks for Lithuanian and Bulgarian.

Submitted 19 October, 2020; originally announced October 2020.

Journal ref: CLSST@LREC 2020 68-73

arXiv:2010.09608 [pdf, other]

Incorporating Terminology Constraints in Automatic Post-Editing

Authors: David Wan, Chris Kedzie, Faisal Ladhak, Marine Carpuat, Kathleen McKeown

Abstract: Users of machine translation (MT) may want to ensure the use of specific lexical terminologies. While there exist techniques for incorporating terminology constraints during inference for MT, current APE approaches cannot ensure that they will appear in the final translation. In this paper, we present both autoregressive and non-autoregressive models for lexically constrained APE, demonstrating that our approach enables preservation of 95% of the terminologies and also improves translation quality on English-German benchmarks. Even when applied to lexically constrained MT output, our approach is able to improve preservation of the terminologies. However, we show that our models do not learn to copy constraints systematically and suggest a simple data augmentation technique that leads to improved performance and robustness.

Submitted 19 October, 2020; originally announced October 2020.

Comments: To appear in WMT, 2020

arXiv:2010.05594 [pdf, other]

MultiWOZ 2.3: A multi-domain task-oriented dialogue dataset enhanced with annotation corrections and co-reference annotation

Authors: Ting Han, Ximing Liu, Ryuichi Takanobu, Yixin Lian, Chongxuan Huang, Dazhen Wan, Wei Peng, Minlie Huang

Abstract: Task-oriented dialogue systems have made unprecedented progress with multiple state-of-the-art (SOTA) models underpinned by a number of publicly available MultiWOZ datasets. Dialogue state annotations are error-prone, leading to sub-optimal performance. Various efforts have been put into rectifying the annotation errors present in the original MultiWOZ dataset. In this paper, we introduce MultiWOZ 2.3, in which we differentiate incorrect annotations in dialogue acts from dialogue states, identifying a lack of co-reference when publishing the updated dataset. To ensure consistency between dialogue acts and dialogue states, we implement co-reference features and unify annotations of dialogue acts and dialogue states. We update the state-of-the-art performance of natural language understanding and dialogue state tracking on MultiWOZ 2.3, where the results show significant improvements over previous versions of MultiWOZ datasets (2.0-2.2).

Submitted 14 June, 2021; v1 submitted 12 October, 2020; originally announced October 2020.

arXiv:2007.13214 [pdf, other]

Computing zeta functions of large polynomial systems over finite fields

Authors: Qi Cheng, J. Maurice Rojas, Daqing Wan

Abstract: In this paper, we improve the algorithms of Lauder-Wan \cite{LW} and Harvey \cite{Ha} to compute the zeta function of a system of $m$ polynomial equations in $n$ variables over the finite field $\FF_q$ of $q$ elements, for $m$ large. The dependence on $m$ in the original algorithms was exponential in $m$. Our main result is a reduction of the exponential dependence on $m$ to a polynomial dependence on $m$. As an application, we speed up a doubly exponential time algorithm from a software verification paper \cite{BJK} (on universal equivalence of programs over finite fields) to singly exponential time. One key new ingredient is an effective version of the classical Kronecker theorem which (set-theoretically) reduces the number of defining equations for a "large" polynomial system over $\FF_q$ when $q$ is suitably large.

Submitted 26 July, 2020; originally announced July 2020.

arXiv:2007.11162 [pdf, ps, other]

doi 10.1016/j.disc.2020.112072

Rational points on complete symmetric hypersurfaces over finite fields

Authors: Jun Zhang, Daqing Wan

Abstract: For any affine hypersurface defined by a complete symmetric polynomial in $k\geq 3$ variables of degree $m$ over the finite field $\mathbb{F}_{q}$ of $q$ elements, a special case of our theorem says that this hypersurface has at least $6q^{k-3}$ rational points over $\mathbb{F}_{q}$ if $1\leq m \leq q-3$ and $q$ is odd. A key ingredient in our proof is Segre's classical theorem on ovals in finite projective planes.

Submitted 21 July, 2020; originally announced July 2020.

Comments: 15 pages

Journal ref: Discrete Mathematics (2020) vol.343(11)

arXiv:2004.00917 [pdf, other]

Controllable Orthogonalization in Training DNNs

Authors: Lei Huang, Li Liu, Fan Zhu, Diwen Wan, Zehuan Yuan, Bo Li, Ling Shao

Abstract: Orthogonality is widely used for training deep neural networks (DNNs) due to its ability to maintain all singular values of the Jacobian close to 1 and reduce redundancy in representation. This paper proposes a computationally efficient and numerically stable orthogonalization method using Newton's iteration (ONI), to learn a layer-wise orthogonal weight matrix in DNNs. ONI works by iteratively stretching the singular values of a weight matrix towards 1. This property enables it to control the orthogonality of a weight matrix by its number of iterations. We show that our method improves the performance of image classification networks by effectively controlling the orthogonality to provide an optimal tradeoff between optimization benefits and representational capacity reduction. We also show that ONI stabilizes the training of generative adversarial networks (GANs) by maintaining the Lipschitz continuity of a network, similar to spectral normalization (SN), and further outperforms SN by providing controllable orthogonality.

Submitted 2 April, 2020; originally announced April 2020.

Comments: Accepted to CVPR 2020. The Code is available at https://github.com/huangleiBuaa/ONI

arXiv:2002.00583 [pdf, other]

CoTK: An Open-Source Toolkit for Fast Development and Fair Evaluation of Text Generation

Authors: Fei Huang, Dazhen Wan, Zhihong Shao, Pei Ke, Jian Guan, Yilin Niu, Xiaoyan Zhu, Minlie Huang

Abstract: In text generation evaluation, many practical issues, such as inconsistent experimental settings and metric implementations, are often ignored but lead to unfair evaluation and untenable conclusions. We present CoTK, an open-source toolkit aiming to support fast development and fair evaluation of text generation. In model development, CoTK helps handle the cumbersome issues, such as data processing, metric implementation, and reproduction. It standardizes the development steps and reduces human errors which may lead to inconsistent experimental settings. In model evaluation, CoTK provides implementation for many commonly used metrics and benchmark models across different experimental settings. As a unique feature, CoTK can signify when and which metric cannot be fairly compared. We demonstrate that it is convenient to use CoTK for model development and evaluation, particularly across different experimental settings.

Submitted 3 February, 2020; originally announced February 2020.

Comments: Submitting to ACL2020 demo

ACM Class: I.2.7

arXiv:1906.07861 [pdf]

Controllable Planning, Responsibility, and Information in Automatic Driving Technology

Authors: Dan Wan, Hao Zhan

Abstract: People hope that automated driving technology will always be in a stable and controllable state; more precisely, this controllability can be divided into controllable planning, responsibility, and information. Otherwise, it would bring about the problems of the tram dilemma, responsibility attribution, information leakage, and security. This article discusses these three types of issues separately and clarifies some misunderstandings.

Submitted 27 June, 2019; v1 submitted 18 June, 2019; originally announced June 2019.

Comments: The 7th International Symposium on Project Management (ISPM2019)

arXiv:1905.03035 [pdf]

Applications of Social Media in Hydroinformatics: A Survey

Authors: Yufeng Yu, Yuelong Zhu, Dingsheng Wan, Qun Zhao, Kai Shu, Huan Liu

Abstract: Floods of research and practical applications employ social media data for a wide range of public applications, including environmental monitoring, water resource management, and disaster and emergency response. Hydroinformatics can benefit from social media technologies with newly emerged data, techniques, and analytical tools to handle large datasets, from which creative ideas and new values could be mined. This paper first proposes a 4W (What, Why, When, hoW) model and a methodological structure to better understand and represent the application of social media to hydroinformatics, then provides an overview of academic research applying social media to hydroinformatics, such as water environment, water resources, flood, drought, and water scarcity management. Finally, some advanced topics and suggestions for water-related social media applications, covering data collection, data quality management, fake news detection, privacy issues, algorithms, and platforms, are presented to hydroinformatics managers and researchers based on the previous discussion.

Submitted 1 May, 2019; originally announced May 2019.

Comments: 37 pages

arXiv:1905.00421 [pdf]

A Novel Trend Symbolic Aggregate Approximation for Time Series

Authors: Yufeng Yu, Yuelong Zhu, Dingsheng Wan, Qun Zhao, Huan Liu

Abstract: Symbolic Aggregate approximation (SAX) is a classical symbolic approach in many time series data mining applications. However, SAX only reflects the segment mean value feature and misses important information in a segment, namely the trend of the value change in the segment. Such an omission may cause a wrong classification in some cases, since the SAX representation cannot distinguish different time series with similar average values but different trends. In this paper, we present Trend Feature Symbolic Aggregate approximation (TFSAX) to solve this problem. First, we utilize the Piecewise Aggregate Approximation (PAA) approach to reduce dimensionality and discretize the mean value of each segment by SAX. Second, we extract the trend feature in each segment by using a trend distance factor and a trend shape factor. Then, we design multi-resolution symbolic mapping rules to discretize trend information into symbols. We also propose a modified distance measure by integrating the SAX distance with a weighted trend distance. We show that our distance measure has a tighter lower bound to the Euclidean distance than that of the original SAX. The experimental results on diverse time series data sets demonstrate that our proposed representation significantly outperforms the original SAX representation and an improved SAX representation for classification.

Submitted 1 May, 2019; originally announced May 2019.

Comments: 9 pages, ACM_IMCOM2019_CFP

arXiv:1901.05445 [pdf, ps, other]

Deep Holes of Projective Reed-Solomon Codes

Authors: Jun Zhang, Daqing Wan, Krishna Kaipa

Abstract: Projective Reed-Solomon (PRS) codes are Reed-Solomon codes of the maximum possible length q+1. The classification of deep holes --received words with maximum possible error distance-- for PRS codes is an important and difficult problem. In this paper, we use algebraic methods to explicitly construct three classes of deep holes for PRS codes. We show that these three classes completely classify all deep holes of PRS codes with redundancy at most four. Previously, the deep hole classification was only known for PRS codes with redundancy at most three in work arXiv:1612.05447

Submitted 3 September, 2019; v1 submitted 16 January, 2019; originally announced January 2019.

Comments: to appear in IEEE Transactions on Information Theory

MSC Class: 11T71; 94B27

arXiv:1809.00699 [pdf, other]

Multi-Level Structured Self-Attentions for Distantly Supervised Relation Extraction

Authors: Jinhua Du, Jingguang Han, Andy Way, Dadong Wan

Abstract: Attention mechanisms are often used in deep neural networks for distantly supervised relation extraction (DS-RE) to distinguish valid from noisy instances. However, traditional 1-D vector attention models are insufficient for the learning of different contexts in the selection of valid instances to predict the relationship for an entity pair. To alleviate this issue, we propose a novel multi-level structured (2-D matrix) self-attention mechanism for DS-RE in a multi-instance learning (MIL) framework using bidirectional recurrent neural networks. In the proposed method, a structured word-level self-attention mechanism learns a 2-D matrix where each row vector represents a weight distribution for different aspects of an instance regarding two entities. Targeting the MIL issue, the structured sentence-level attention learns a 2-D matrix where each row vector represents a weight distribution on selection of different valid instances. Experiments conducted on two publicly available DS-RE datasets show that the proposed framework with a multi-level structured self-attention mechanism significantly outperforms state-of-the-art baselines in terms of PR curves, P@N and F1 measures.

Submitted 3 September, 2018; originally announced September 2018.

Comments: Accepted by EMNLP2018

arXiv:1806.00152 [pdf, ps, other]

Distance Distribution to Received Words in Reed-Solomon Codes

Authors: Jiyou Li, Daqing Wan

Abstract: Let $\mathbb{F}_q$ be the finite field of $q$ elements. In this paper we obtain bounds on the following counting problem: given a polynomial $f(x)\in \mathbb{F}_q[x]$ of degree $k+m$ and a non-negative integer $r$, count the number of polynomials $g(x)\in \mathbb{F}_q[x]$ of degree at most $k-1$ such that $f(x)+g(x)$ has exactly $r$ roots in $\mathbb{F}_q$. Previously, explicit formulas were known only for the cases $m=0, 1, 2$. As an application, we obtain an asymptotic formula on the list size of the standard Reed-Solomon code $[q, k, q-k+1]_q$.

Submitted 29 July, 2019; v1 submitted 31 May, 2018; originally announced June 2018.

Comments: 15 pages

arXiv:1801.04650 [pdf, ps, other]

Non-Orthogonal Multiple Access For Cooperative Communications: Challenges, Opportunities, And Trends

Authors: Dehuan Wan, Miaowen Wen, Fei Ji, Hua Yu, Fangjiong Chen

Abstract: Non-orthogonal multiple access (NOMA) is a promising radio access technique for next-generation wireless networks. In this article, we investigate the NOMA-based cooperative relay network. We begin with an introduction of the existing relay-assisted NOMA systems by classifying them into three categories: uplink, downlink, and composite architectures. Then, we discuss their principles and key features, and provide a comprehensive comparison from the perspective of spectral efficiency, energy efficiency, and total transmit power. A novel strategy termed hybrid power allocation is further discussed for the composite architecture, which can reduce the computational complexity and signaling overhead at the expense of marginal sum rate degradation. Finally, major challenges, opportunities, and future research trends for the design of NOMA-based cooperative relay systems with other techniques are also highlighted to provide insights for researchers in this field.

Submitted 14 January, 2018; originally announced January 2018.

arXiv:1711.11202 [pdf, ps, other]

On deep-holes of Gabidulin codes

Authors: Weijun Fang, Li-Ping Wang, Daqing Wan

Abstract: In this paper, we determine the covering radius and a class of deep holes for Gabidulin codes with both rank metric and Hamming metric. Moreover, we give a necessary and sufficient condition for deciding whether a word is not a deep hole for Gabidulin codes, by which we study the error distance of a special class of words to certain Gabidulin codes.

Submitted 8 September, 2018; v1 submitted 29 November, 2017; originally announced November 2017.

Comments: Published in Finite Fields and Their Applications

arXiv:1711.02292 [pdf, ps, other]

Explicit Deep Holes of Reed-Solomon Codes

Authors: Jun Zhang, Daqing Wan

Abstract: In this paper, deep holes of Reed-Solomon (RS) codes are studied. A new class of deep holes for generalized affine RS codes is given if the evaluation set satisfies certain combinatorial structure. Three classes of deep holes for projective Reed-Solomon (PRS) codes are constructed explicitly. In particular, deep holes of PRS codes with redundancy three are completely obtained when the characteristic of the finite field is odd. Most (asymptotically of ratio $1$) of the deep holes of PRS codes with redundancy four are also obtained.

Submitted 7 November, 2017; originally announced November 2017.

Comments: 31 pages

MSC Class: 94B05

arXiv:1711.01355 [pdf, ps, other]

doi 10.2140/obs.2019.2.191

Counting Roots of Polynomials Over Prime Power Rings

Authors: Qi Cheng, Shuhong Gao, J. Maurice Rojas, Daqing Wan

Abstract: Suppose $p$ is a prime, $t$ is a positive integer, and $f\!\in\!\mathbb{Z}[x]$ is a univariate polynomial of degree $d$ with coefficients of absolute value $<\!p^t$. We show that for any fixed $t$, we can compute the number of roots in $\mathbb{Z}/(p^t)$ of $f$ in deterministic time $(d+\log p)^{O(1)}$. This fixed parameter tractability appears to be new for $t\!\geq\!3$. A consequence for arithmetic geometry is that we can efficiently compute Igusa zeta functions $Z$, for univariate polynomials, assuming the degree of $Z$ is fixed.

Submitted 3 November, 2017; originally announced November 2017.

Comments: title page, plus 11 pages, no illustrations, submitted to a conference

Journal ref: Open Book Series 2 (2019) 191-205

arXiv:1605.02423 [pdf, ps, other]

On Deep Holes of Projective Reed-Solomon Codes

Authors: Jun Zhang, Daqing Wan

Abstract: In this paper, we obtain new results on the covering radius and deep holes for projective Reed-Solomon (PRS) codes.

Submitted 9 May, 2016; originally announced May 2016.

Comments: 5 pages, accepted by The 2016 IEEE International Symposium on Information Theory (ISIT2016)

MSC Class: 94B05

arXiv:1507.00988 [pdf, ps, other]

Index bounds for character sums with polynomials over finite fields

Authors: Daqing Wan, Qiang Wang

Abstract: We provide an index bound for character sums of polynomials over finite fields. This improves the Weil bound for high degree polynomials with small indices, as well as polynomials with large indices that are generated by cyclotomic mappings of small indices. As an application, we also give some general bounds for numbers of solutions of some Artin-Schreier equations and minimum weights of some cyclic codes.

Submitted 3 July, 2015; originally announced July 2015.

MSC Class: 11T24

arXiv:1501.01138 [pdf, ps, other]

On the minimum distance of elliptic curve codes

Authors: Jiyou Li, Daqing Wan, Jun Zhang

Abstract: Computing the minimum distance of a linear code is one of the fundamental problems in algorithmic coding theory. Vardy [14] showed that it is an \np-hard problem for general linear codes. In practice, one often uses codes with additional mathematical structure, such as AG codes. For AG codes of genus $0$ (generalized Reed-Solomon codes), the minimum distance has a simple explicit formula. An interesting result of Cheng [3] says that the minimum distance problem is already \np-hard (under \rp-reduction) for general elliptic curve codes (ECAG codes, or AG codes of genus $1$). In this paper, we show that the minimum distance of ECAG codes also has a simple explicit formula if the evaluation set is suitably large (at least $2/3$ of the group order). Our method is purely combinatorial and based on a new sieving technique from the first two authors [8]. This method also proves a significantly stronger version of the MDS (maximum distance separable) conjecture for ECAG codes.

Submitted 7 January, 2015; v1 submitted 6 January, 2015; originally announced January 2015.

Comments: 13 pages

arXiv:1411.6346 [pdf, ps, other]

Sparse Univariate Polynomials with Many Roots Over Finite Fields

Authors: Qi Cheng, Shuhong Gao, J. Maurice Rojas, Daqing Wan

Abstract: Suppose $q$ is a prime power and $f\in\mathbb{F}_q[x]$ is a univariate polynomial with exactly $t$ monomial terms and degree $<q-1$. To establish a finite field analogue of Descartes' Rule, Bi, Cheng, and Rojas (2013) proved an upper bound of $2(q-1)^{\frac{t-2}{t-1}}$ on the number of cosets in $\mathbb{F}^*_q$ needed to cover the roots of $f$ in $\mathbb{F}^*_q$. Here, we give explicit $f$ with root structure approaching this bound: For $q$ a $(t-1)$-st power of a prime we give an explicit $t$-nomial vanishing on $q^{\frac{t-2}{t-1}}$ distinct cosets of $\mathbb{F}^*_q$. Over prime fields $\mathbb{F}_p$, computational data we provide suggests that it is harder to construct explicit sparse polynomials with many roots. Nevertheless, assuming the Generalized Riemann Hypothesis, we find explicit trinomials having $Ω\left(\frac{\log p}{\log \log p}\right)$ distinct roots in $\mathbb{F}_p$.

Submitted 6 July, 2016; v1 submitted 23 November, 2014; originally announced November 2014.

Comments: 9 pages, 1 figure, presented at MEGA 2015. This is the journal version, and includes new extremal examples and additional references, including pointers to recent advances by Kelley and Owen. Comments and questions welcome

arXiv:1310.5124 [pdf, ps, other]

doi 10.1112/S1461157014000242

Traps to the BGJT-Algorithm for Discrete Logarithms

Authors: Qi Cheng, Daqing Wan, Jincheng Zhuang

Abstract: In the recent breakthrough paper by Barbulescu, Gaudry, Joux and Thomé, a quasi-polynomial time algorithm (QPA) is proposed for the discrete logarithm problem over finite fields of small characteristic. The time complexity analysis of the algorithm is based on several heuristics presented in their paper. We show that some of the heuristics are problematic in their original forms, in particular, when the field is not a Kummer extension. We believe that the basic idea behind the new approach should still work, and propose a fix to the algorithm in non-Kummer cases, without altering the quasi-polynomial time complexity. The modified algorithm is also heuristic. Further study is required in order to fully understand the effectiveness of the new approach.

Submitted 18 October, 2013; originally announced October 2013.

MSC Class: 11Y16

arXiv:1304.7402 [pdf, ps, other]

Stopping Sets of Algebraic Geometry Codes

Authors: Jun Zhang, Fang-Wei Fu, Daqing Wan

Abstract: Stopping sets and stopping set distribution of a linear code play an important role in the performance analysis of iterative decoding for this linear code. Let $C$ be an $[n,k]$ linear code over $\mathbb{F}_q$ with parity-check matrix $H$, where the rows of $H$ may be dependent. Let $[n]=\{1,2,\ldots,n\}$ denote the set of column indices of $H$. A \emph{stopping set} $S$ of $C$ with parity-check matrix $H$ is a subset of $[n]$ such that the restriction of $H$ to $S$ does not contain a row of weight 1. The \emph{stopping set distribution} $\{T_{i}(H)\}_{i=0}^{n}$ enumerates the number of stopping sets with size $i$ of $C$ with parity-check matrix $H$. Denote by $H^{*}$ the parity-check matrix consisting of all the non-zero codewords in the dual code $C^{\bot}$. In this paper, we study stopping sets and stopping set distributions of some residue algebraic geometry (AG) codes with parity-check matrix $H^*$. First, we give two descriptions of stopping sets of residue AG codes. For the simplest AG codes, i.e., the generalized Reed-Solomon codes, it is easy to determine all the stopping sets. Then we consider AG codes from elliptic curves. We use the group structure of rational points of elliptic curves to present a complete characterization of stopping sets. Then the stopping sets, the stopping set distribution and the stopping distance of the AG code from an elliptic curve are reduced to the search, counting and decision versions of the subset sum problem in the group of rational points of the elliptic curve, respectively. Finally, for some special cases, we determine the stopping set distributions of AG codes from elliptic curves.

Submitted 27 April, 2013; originally announced April 2013.

Comments: 17 pages

MSC Class: 11T71
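
The stopping-set definition in the abstract of "Stopping Sets of Algebraic Geometry Codes" can be illustrated by brute force. The sketch below is not the paper's method: it enumerates all column subsets of a small, made-up binary parity-check matrix `H` (an assumption for illustration, not the matrix $H^*$ studied in the paper) and uses 0-based column indices instead of $[n]=\{1,\ldots,n\}$. The function names are hypothetical.

```python
from itertools import combinations


def is_stopping_set(H, S):
    """Return True if no row of H, restricted to the columns in S, has weight exactly 1."""
    for row in H:
        weight = sum(1 for j in S if row[j] != 0)
        if weight == 1:
            return False
    return True


def stopping_set_distribution(H):
    """Return [T_0, ..., T_n], where T_i counts the stopping sets of size i."""
    n = len(H[0])
    return [
        sum(1 for S in combinations(range(n), i) if is_stopping_set(H, S))
        for i in range(n + 1)
    ]


# Toy parity-check matrix, chosen only for illustration (hypothetical data).
H = [
    [1, 1, 0, 1],
    [0, 1, 1, 1],
]

distribution = stopping_set_distribution(H)
print(distribution)  # [1, 0, 1, 4, 1]
```

The enumeration is exponential in $n$, so it is only suitable for checking the definition on tiny examples.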

arXiv:1111.1224 [pdf, ps, other]

Counting Value Sets: Algorithm and Complexity

Authors: Qi Cheng, Joshua E. Hill, Daqing Wan

Abstract: Let $p$ be a prime. Given a polynomial in $\mathbb{F}_{p^m}[x]$ of degree $d$ over the finite field $\mathbb{F}_{p^m}$, one can view it as a map from $\mathbb{F}_{p^m}$ to $\mathbb{F}_{p^m}$, and examine the image of this map, also known as the value set. In this paper, we present the first non-trivial algorithm and the first complexity result on computing the cardinality of this value set. We show an elementary connection between this cardinality and the number of points on a family of varieties in affine space. We then apply Lauder and Wan's $p$-adic point-counting algorithm to count these points, resulting in a non-trivial algorithm for calculating the cardinality of the value set. The running time of our algorithm is $(pmd)^{O(d)}$. In particular, this is a polynomial time algorithm for fixed $d$ if $p$ is reasonably small. We also show that the problem is #P-hard when the polynomial is given in a sparse representation, $p=2$, and $m$ is allowed to vary, or when the polynomial is given as a straight-line program, $m=1$ and $p$ is allowed to vary. Additionally, we prove that it is NP-hard to decide whether a polynomial represented by a straight-line program has a root in a prime-order finite field, thus resolving an open problem proposed by Kaltofen and Koiran in \cite{Kaltofen03,KaltofenKo05}.

Submitted 4 November, 2011; originally announced November 2011.

MSC Class: 11Yxx
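
To make the notion of a value set from the abstract of "Counting Value Sets: Algorithm and Complexity" concrete, here is a minimal sketch of the trivial baseline for the prime-field case $m=1$: evaluate the polynomial at every element of $\mathbb{F}_p$ and count the distinct outputs. This is not the $p$-adic point-counting algorithm the abstract describes. The function name and the dense coefficient-list representation are assumptions made for illustration.

```python
def value_set_size(coeffs, p):
    """Count distinct values of a polynomial over F_p (p prime) by exhaustive evaluation.

    coeffs[i] is the coefficient of x**i (dense representation, an assumption here).
    """
    def evaluate(x):
        # Horner's rule, reducing modulo p at every step.
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % p
        return acc

    return len({evaluate(x) for x in range(p)})


print(value_set_size([0, 0, 1], 7))     # x^2 over F_7 -> 4
print(value_set_size([0, 0, 0, 1], 7))  # x^3 over F_7 -> 3
print(value_set_size([0, 0, 0, 1], 5))  # x^3 over F_5 -> 5 (a permutation)
```

This baseline takes time proportional to $p \cdot d$, which is exponential in the bit length of $p$. That cost is what makes a non-trivial algorithm interesting.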

arXiv:0802.1220 [pdf, ps, other]

Complexity of Decoding Positive-Rate Reed-Solomon Codes

Authors: Qi Cheng, Daqing Wan

Abstract: The complexity of maximal likelihood decoding of the Reed-Solomon codes $[q-1, k]_q$ is a well known open problem. The only known result in this direction states that it is at least as hard as the discrete logarithm in some cases where the information rate unfortunately goes to zero. In this paper, we remove the rate restriction and prove that the same complexity result holds for any positive information rate. In particular, this resolves an open problem left in [4], and rules out the possibility of a polynomial time algorithm for maximal likelihood decoding problem of Reed-Solomon codes of any rate under a well known cryptographical hardness assumption. As a side result, we give an explicit construction of Hamming balls of radius bounded away from the minimum distance, which contain exponentially many codewords for Reed-Solomon code of any positive rate less than one. The previous constructions only apply to Reed-Solomon codes of diminishing rates. We also give an explicit construction of Hamming balls of relative radius less than 1 which contain subexponentially many codewords for Reed-Solomon code of rate approaching one.

Submitted 8 February, 2008; originally announced February 2008.

MSC Class: 94B05; 11T71

arXiv:0708.2456 [pdf, ps, other]

On the subset sum problem over finite fields

Authors: Jiyou Li, Daqing Wan

Abstract: The subset sum problem over finite fields is a well-known NP-complete problem. It arises naturally from decoding generalized Reed-Solomon codes. In this paper, we study the number of solutions of the subset sum problem from a mathematical point of view. In several interesting cases, we obtain explicit or asymptotic formulas for the solution number. As a consequence, we obtain some results on the decoding problem of Reed-Solomon codes.

Submitted 17 August, 2007; originally announced August 2007.

Comments: 16 pages

MSC Class: 11T71;94B35
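
The solution-counting question in the abstract of "On the subset sum problem over finite fields" can be tried on small cases by enumeration. The sketch below assumes one common formulation that the abstract does not spell out: count the $k$-element subsets of a set $D \subseteq \mathbb{F}_p$ whose elements sum to a target $b$. The prime-field restriction, the function name and its parameters are all assumptions made for illustration. The paper derives formulas, not this brute-force count.

```python
from itertools import combinations


def count_subset_sums(D, k, b, p):
    """Count k-element subsets of D (residues modulo a prime p) whose sum is b modulo p."""
    return sum(1 for S in combinations(D, k) if sum(S) % p == b % p)


# D is all of F_5 in this toy example.
D = range(5)
print(count_subset_sums(D, 2, 0, 5))  # 2: the subsets {1, 4} and {2, 3}
print(count_subset_sums(D, 3, 0, 5))  # 2: the subsets {0, 1, 4} and {0, 2, 3}
```

The enumeration visits $\binom{|D|}{k}$ subsets, so it only serves to sanity-check closed-form counts on very small fields.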

arXiv:math/0405082 [pdf, ps, other]

On the List and Bounded Distance Decodibility of the Reed-Solomon Codes

Authors: Qi Cheng, Daqing Wan

Abstract: In this paper we show that the list and bounded-distance decoding problems of certain bounds for the Reed-Solomon code are at least as hard as the discrete logarithm problem over finite fields.

Submitted 5 May, 2004; originally announced May 2004.

MSC Class: 11Y16; 68Q25

Showing 1–45 of 45 results for author: Wan, D