-
Mapping New Realities: Ground Truth Image Creation with Pix2Pix Image-to-Image Translation
Authors:
Zhenglin Li,
Bo Guan,
Yuanzhou Wei,
Yiming Zhou,
Jingyu Zhang,
Jinxin Xu
Abstract:
Generative Adversarial Networks (GANs) have significantly advanced image processing, with Pix2Pix being a notable framework for image-to-image translation. This paper explores a novel application of Pix2Pix to transform abstract map images into realistic ground truth images, addressing the scarcity of such images crucial for domains like urban planning and autonomous vehicle training. We detail th…
▽ More
Generative Adversarial Networks (GANs) have significantly advanced image processing, with Pix2Pix being a notable framework for image-to-image translation. This paper explores a novel application of Pix2Pix to transform abstract map images into realistic ground truth images, addressing the scarcity of such images crucial for domains like urban planning and autonomous vehicle training. We detail the Pix2Pix model's utilization for generating high-fidelity datasets, supported by a dataset of paired map and aerial images, and enhanced by a tailored training regimen. The results demonstrate the model's capability to accurately render complex urban features, establishing its efficacy and potential for broad real-world applications.
△ Less
Submitted 30 April, 2024; v1 submitted 30 April, 2024;
originally announced April 2024.
-
CodeIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code
Authors:
Batu Guan,
Yao Wan,
Zhangqian Bi,
Zheng Wang,
Hongyu Zhang,
Yulei Sui,
Pan Zhou,
Lichao Sun
Abstract:
As Large Language Models (LLMs) are increasingly used to automate code generation, it is often desired to know if the code is AI-generated and by which model, especially for purposes like protecting intellectual property (IP) in industry and preventing academic misconduct in education. Incorporating watermarks into machine-generated content is one way to provide code provenance, but existing solut…
▽ More
As Large Language Models (LLMs) are increasingly used to automate code generation, it is often desired to know if the code is AI-generated and by which model, especially for purposes like protecting intellectual property (IP) in industry and preventing academic misconduct in education. Incorporating watermarks into machine-generated content is one way to provide code provenance, but existing solutions are restricted to a single bit or lack flexibility. We present CodeIP, a new watermarking technique for LLM-based code generation. CodeIP enables the insertion of multi-bit information while preserving the semantics of the generated code, improving the strength and diversity of the inerseted watermark. This is achieved by training a type predictor to predict the subsequent grammar type of the next token to enhance the syntactical and semantic correctness of the generated code. Experiments on a real-world dataset across five programming languages showcase the effectiveness of CodeIP.
△ Less
Submitted 24 April, 2024;
originally announced April 2024.
-
Reinforcement Learning Approach for Integrating Compressed Contexts into Knowledge Graphs
Authors:
Ngoc Quach,
Qi Wang,
Zijun Gao,
Qifeng Sun,
Bo Guan,
Lillian Floyd
Abstract:
The widespread use of knowledge graphs in various fields has brought about a challenge in effectively integrating and updating information within them. When it comes to incorporating contexts, conventional methods often rely on rules or basic machine learning models, which may not fully grasp the complexity and fluidity of context information. This research suggests an approach based on reinforcem…
▽ More
The widespread use of knowledge graphs in various fields has brought about a challenge in effectively integrating and updating information within them. When it comes to incorporating contexts, conventional methods often rely on rules or basic machine learning models, which may not fully grasp the complexity and fluidity of context information. This research suggests an approach based on reinforcement learning (RL), specifically utilizing Deep Q Networks (DQN) to enhance the process of integrating contexts into knowledge graphs. By considering the state of the knowledge graph as environment states defining actions as operations for integrating contexts and using a reward function to gauge the improvement in knowledge graph quality post-integration, this method aims to automatically develop strategies for optimal context integration. Our DQN model utilizes networks as function approximators, continually updating Q values to estimate the action value function, thus enabling effective integration of intricate and dynamic context information. Initial experimental findings show that our RL method outperforms techniques in achieving precise context integration across various standard knowledge graph datasets, highlighting the potential and effectiveness of reinforcement learning in enhancing and managing knowledge graphs.
△ Less
Submitted 18 April, 2024;
originally announced April 2024.
-
Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback
Authors:
Zhangqian Bi,
Yao Wan,
Zheng Wang,
Hongyu Zhang,
Batu Guan,
Fangxin Lu,
Zili Zhang,
Yulei Sui,
Hai Jin,
Xuanhua Shi
Abstract:
Large Language Models (LLMs) have shown remarkable progress in automated code generation. Yet, LLM-generated code may contain errors in API usage, class, data structure, or missing project-specific information. As much of this project-specific context cannot fit into the prompts of LLMs, we must find ways to allow the model to explore the project-level code context. We present CoCoGen, a new code…
▽ More
Large Language Models (LLMs) have shown remarkable progress in automated code generation. Yet, LLM-generated code may contain errors in API usage, class, data structure, or missing project-specific information. As much of this project-specific context cannot fit into the prompts of LLMs, we must find ways to allow the model to explore the project-level code context. We present CoCoGen, a new code generation approach that uses compiler feedback to improve the LLM-generated code. CoCoGen first leverages static analysis to identify mismatches between the generated code and the project's context. It then iteratively aligns and fixes the identified errors using information extracted from the code repository. We integrate CoCoGen with two representative LLMs, i.e., GPT-3.5-Turbo and Code Llama (13B), and apply it to Python code generation. Experimental results show that CoCoGen significantly improves the vanilla LLMs by over 80% in generating code dependent on the project context and consistently outperforms the existing retrieval-based code generation baselines.
△ Less
Submitted 10 June, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
CodeS: Natural Language to Code Repository via Multi-Layer Sketch
Authors:
Daoguang Zan,
Ailun Yu,
Wei Liu,
Dong Chen,
Bo Shen,
Wei Li,
Yafen Yao,
Yongshun Gong,
Xiaolin Chen,
Bei Guan,
Zhiguang Yang,
Yongji Wang,
Qianxiang Wang,
Lizhen Cui
Abstract:
The impressive performance of large language models (LLMs) on code-related tasks has shown the potential of fully automated software development. In light of this, we introduce a new software engineering task, namely Natural Language to code Repository (NL2Repo). This task aims to generate an entire code repository from its natural language requirements. To address this task, we propose a simple y…
▽ More
The impressive performance of large language models (LLMs) on code-related tasks has shown the potential of fully automated software development. In light of this, we introduce a new software engineering task, namely Natural Language to code Repository (NL2Repo). This task aims to generate an entire code repository from its natural language requirements. To address this task, we propose a simple yet effective framework CodeS, which decomposes NL2Repo into multiple sub-tasks by a multi-layer sketch. Specifically, CodeS includes three modules: RepoSketcher, FileSketcher, and SketchFiller. RepoSketcher first generates a repository's directory structure for given requirements; FileSketcher then generates a file sketch for each file in the generated structure; SketchFiller finally fills in the details for each function in the generated file sketch. To rigorously assess CodeS on the NL2Repo task, we carry out evaluations through both automated benchmarking and manual feedback analysis. For benchmark-based evaluation, we craft a repository-oriented benchmark, SketchEval, and design an evaluation metric, SketchBLEU. For feedback-based evaluation, we develop a VSCode plugin for CodeS and engage 30 participants in conducting empirical studies. Extensive experiments prove the effectiveness and practicality of CodeS on the NL2Repo task.
△ Less
Submitted 25 March, 2024;
originally announced March 2024.
-
Six-Point Method for Multi-Camera Systems with Reduced Solution Space
Authors:
Banglei Guan,
Ji Zhao,
Laurent Kneip
Abstract:
Relative pose estimation using point correspondences (PC) is a widely used technique. A minimal configuration of six PCs is required for generalized cameras. In this paper, we present several minimal solvers that use six PCs to compute the 6DOF relative pose of a multi-camera system, including a minimal solver for the generalized camera and two minimal solvers for the practical configuration of tw…
▽ More
Relative pose estimation using point correspondences (PC) is a widely used technique. A minimal configuration of six PCs is required for generalized cameras. In this paper, we present several minimal solvers that use six PCs to compute the 6DOF relative pose of a multi-camera system, including a minimal solver for the generalized camera and two minimal solvers for the practical configuration of two-camera rigs. The equation construction is based on the decoupling of rotation and translation. Rotation is represented by Cayley or quaternion parametrization, and translation can be eliminated by using the hidden variable technique. Ray bundle constraints are found and proven when a subset of PCs relate the same cameras across two views. This is the key to reducing the number of solutions and generating numerically stable solvers. Moreover, all configurations of six-point problems for multi-camera systems are enumerated. Extensive experiments demonstrate that our solvers are more accurate than the state-of-the-art six-point methods, while achieving better performance in efficiency.
△ Less
Submitted 28 February, 2024;
originally announced February 2024.
-
Improving Natural Language Capability of Code Large Language Model
Authors:
Wei Li,
Daoguang Zan,
Bei Guan,
Ailun Yu,
Xiaolin Chen,
Yongji Wang
Abstract:
Code large language models (Code LLMs) have demonstrated remarkable performance in code generation. Nonetheless, most existing works focus on boosting code LLMs from the perspective of programming capabilities, while their natural language capabilities receive less attention. To fill this gap, we thus propose a novel framework, comprising two modules: AttentionExtractor, which is responsible for e…
▽ More
Code large language models (Code LLMs) have demonstrated remarkable performance in code generation. Nonetheless, most existing works focus on boosting code LLMs from the perspective of programming capabilities, while their natural language capabilities receive less attention. To fill this gap, we thus propose a novel framework, comprising two modules: AttentionExtractor, which is responsible for extracting key phrases from the user's natural language requirements, and AttentionCoder, which leverages these extracted phrases to generate target code to solve the requirement. This framework pioneers an innovative idea by seamlessly integrating code LLMs with traditional natural language processing tools. To validate the effectiveness of the framework, we craft a new code generation benchmark, called MultiNL-H, covering five natural languages. Extensive experimental results demonstrate the effectiveness of our proposed framework.
△ Less
Submitted 25 January, 2024;
originally announced January 2024.
-
PA-SAM: Prompt Adapter SAM for High-Quality Image Segmentation
Authors:
Zhaozhi Xie,
Bochen Guan,
Weihao Jiang,
Muyang Yi,
Yue Ding,
Hongtao Lu,
Lei Zhang
Abstract:
The Segment Anything Model (SAM) has exhibited outstanding performance in various image segmentation tasks. Despite being trained with over a billion masks, SAM faces challenges in mask prediction quality in numerous scenarios, especially in real-world contexts. In this paper, we introduce a novel prompt-driven adapter into SAM, namely Prompt Adapter Segment Anything Model (PA-SAM), aiming to enha…
▽ More
The Segment Anything Model (SAM) has exhibited outstanding performance in various image segmentation tasks. Despite being trained with over a billion masks, SAM faces challenges in mask prediction quality in numerous scenarios, especially in real-world contexts. In this paper, we introduce a novel prompt-driven adapter into SAM, namely Prompt Adapter Segment Anything Model (PA-SAM), aiming to enhance the segmentation mask quality of the original SAM. By exclusively training the prompt adapter, PA-SAM extracts detailed information from images and optimizes the mask decoder feature at both sparse and dense prompt levels, improving the segmentation performance of SAM to produce high-quality masks. Experimental results demonstrate that our PA-SAM outperforms other SAM-based methods in high-quality, zero-shot, and open-set segmentation. We're making the source code and models available at https://github.com/xzz2/pa-sam.
△ Less
Submitted 23 January, 2024;
originally announced January 2024.
-
A GAN-based data poisoning framework against anomaly detection in vertical federated learning
Authors:
Xiaolin Chen,
Daoguang Zan,
Wei Li,
Bei Guan,
Yongji Wang
Abstract:
In vertical federated learning (VFL), commercial entities collaboratively train a model while preserving data privacy. However, a malicious participant's poisoning attack may degrade the performance of this collaborative model. The main challenge in achieving the poisoning attack is the absence of access to the server-side top model, leaving the malicious participant without a clear target model.…
▽ More
In vertical federated learning (VFL), commercial entities collaboratively train a model while preserving data privacy. However, a malicious participant's poisoning attack may degrade the performance of this collaborative model. The main challenge in achieving the poisoning attack is the absence of access to the server-side top model, leaving the malicious participant without a clear target model. To address this challenge, we introduce an innovative end-to-end poisoning framework P-GAN. Specifically, the malicious participant initially employs semi-supervised learning to train a surrogate target model. Subsequently, this participant employs a GAN-based method to produce adversarial perturbations to degrade the surrogate target model's performance. Finally, the generator is obtained and tailored for VFL poisoning. Besides, we develop an anomaly detection algorithm based on a deep auto-encoder (DAE), offering a robust defense mechanism to VFL scenarios. Through extensive experiments, we evaluate the efficacy of P-GAN and DAE, and further analyze the factors that influence their performance.
△ Less
Submitted 17 January, 2024;
originally announced January 2024.
-
FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition
Authors:
Sicheng Mo,
Fangzhou Mu,
Kuan Heng Lin,
Yanli Liu,
Bochen Guan,
Yin Li,
Bolei Zhou
Abstract:
Recent approaches such as ControlNet offer users fine-grained spatial control over text-to-image (T2I) diffusion models. However, auxiliary modules have to be trained for each type of spatial condition, model architecture, and checkpoint, putting them at odds with the diverse intents and preferences a human designer would like to convey to the AI models during the content creation process. In this…
▽ More
Recent approaches such as ControlNet offer users fine-grained spatial control over text-to-image (T2I) diffusion models. However, auxiliary modules have to be trained for each type of spatial condition, model architecture, and checkpoint, putting them at odds with the diverse intents and preferences a human designer would like to convey to the AI models during the content creation process. In this work, we present FreeControl, a training-free approach for controllable T2I generation that supports multiple conditions, architectures, and checkpoints simultaneously. FreeControl designs structure guidance to facilitate the structure alignment with a guidance image, and appearance guidance to enable the appearance sharing between images generated using the same seed. Extensive qualitative and quantitative experiments demonstrate the superior performance of FreeControl across a variety of pre-trained T2I models. In particular, FreeControl facilitates convenient training-free control over many different architectures and checkpoints, allows the challenging input conditions on which most of the existing training-free methods fail, and achieves competitive synthesis quality with training-based approaches.
△ Less
Submitted 12 December, 2023;
originally announced December 2023.
-
Vision Backbone Enhancement via Multi-Stage Cross-Scale Attention
Authors:
Liang Shang,
Yanli Liu,
Zhengyang Lou,
Shuxue Quan,
Nagesh Adluru,
Bochen Guan,
William A. Sethares
Abstract:
Convolutional neural networks (CNNs) and vision transformers (ViTs) have achieved remarkable success in various vision tasks. However, many architectures do not consider interactions between feature maps from different stages and scales, which may limit their performance. In this work, we propose a simple add-on attention module to overcome these limitations via multi-stage and cross-scale interac…
▽ More
Convolutional neural networks (CNNs) and vision transformers (ViTs) have achieved remarkable success in various vision tasks. However, many architectures do not consider interactions between feature maps from different stages and scales, which may limit their performance. In this work, we propose a simple add-on attention module to overcome these limitations via multi-stage and cross-scale interactions. Specifically, the proposed Multi-Stage Cross-Scale Attention (MSCSA) module takes feature maps from different stages to enable multi-stage interactions and achieves cross-scale interactions by computing self-attention at different scales based on the multi-stage feature maps. Our experiments on several downstream tasks show that MSCSA provides a significant performance boost with modest additional FLOPs and runtime.
△ Less
Submitted 14 August, 2023; v1 submitted 10 August, 2023;
originally announced August 2023.
-
Private-Library-Oriented Code Generation with Large Language Models
Authors:
Daoguang Zan,
Bei Chen,
Yongshun Gong,
Junzhi Cao,
Fengji Zhang,
Bingchao Wu,
Bei Guan,
Yilong Yin,
Yongji Wang
Abstract:
Large language models (LLMs), such as Codex and GPT-4, have recently showcased their remarkable code generation abilities, facilitating a significant boost in coding efficiency. This paper will delve into utilizing LLMs for code generation in private libraries, as they are widely employed in everyday programming. Despite their remarkable capabilities, generating such private APIs poses a formidabl…
▽ More
Large language models (LLMs), such as Codex and GPT-4, have recently showcased their remarkable code generation abilities, facilitating a significant boost in coding efficiency. This paper will delve into utilizing LLMs for code generation in private libraries, as they are widely employed in everyday programming. Despite their remarkable capabilities, generating such private APIs poses a formidable conundrum for LLMs, as they inherently lack exposure to these private libraries during pre-training. To address this challenge, we propose a novel framework that emulates the process of programmers writing private code. This framework comprises two modules: APIFinder first retrieves potentially useful APIs from API documentation; and APICoder then leverages these retrieved APIs to generate private code. Specifically, APIFinder employs vector retrieval techniques and allows user involvement in the retrieval process. For APICoder, it can directly utilize off-the-shelf code generation models. To further cultivate explicit proficiency in invoking APIs from prompts, we continuously pre-train a reinforced version of APICoder, named CodeGenAPI. Our goal is to train the above two modules on vast public libraries, enabling generalization to private ones. Meanwhile, we create four private library benchmarks, including TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval, and meticulously handcraft test cases for each benchmark to support comprehensive evaluations. Numerous experiments on the four benchmarks consistently affirm the effectiveness of our approach. Furthermore, deeper analysis is also conducted to glean additional insights.
△ Less
Submitted 28 July, 2023;
originally announced July 2023.
-
Affine Correspondences between Multi-Camera Systems for Relative Pose Estimation
Authors:
Banglei Guan,
Ji Zhao
Abstract:
We present a novel method to compute the relative pose of multi-camera systems using two affine correspondences (ACs). Existing solutions to the multi-camera relative pose estimation are either restricted to special cases of motion, have too high computational complexity, or require too many point correspondences (PCs). Thus, these solvers impede an efficient or accurate relative pose estimation w…
▽ More
We present a novel method to compute the relative pose of multi-camera systems using two affine correspondences (ACs). Existing solutions to the multi-camera relative pose estimation are either restricted to special cases of motion, have too high computational complexity, or require too many point correspondences (PCs). Thus, these solvers impede an efficient or accurate relative pose estimation when applying RANSAC as a robust estimator. This paper shows that the 6DOF relative pose estimation problem using ACs permits a feasible minimal solution, when exploiting the geometric constraints between ACs and multi-camera systems using a special parameterization. We present a problem formulation based on two ACs that encompass two common types of ACs across two views, i.e., inter-camera and intra-camera. Moreover, the framework for generating the minimal solvers can be extended to solve various relative pose estimation problems, e.g., 5DOF relative pose estimation with known rotation angle prior. Experiments on both virtual and real multi-camera systems prove that the proposed solvers are more efficient than the state-of-the-art algorithms, while resulting in a better relative pose accuracy. Source code is available at https://github.com/jizhaox/relpose-mcs-depth.
△ Less
Submitted 22 June, 2023;
originally announced June 2023.
-
ChatGPT as a mapping assistant: A novel method to enrich maps with generative AI and content derived from street-level photographs
Authors:
Levente Juhász,
Peter Mooney,
Hartwig H. Hochmair,
Boyuan Guan
Abstract:
This paper explores the concept of leveraging generative AI as a mapping assistant for enhancing the efficiency of collaborative mapping. We present results of an experiment that combines multiple sources of volunteered geographic information (VGI) and large language models (LLMs). Three analysts described the content of crowdsourced Mapillary street-level photographs taken along roads in a small…
▽ More
This paper explores the concept of leveraging generative AI as a mapping assistant for enhancing the efficiency of collaborative mapping. We present results of an experiment that combines multiple sources of volunteered geographic information (VGI) and large language models (LLMs). Three analysts described the content of crowdsourced Mapillary street-level photographs taken along roads in a small test area in Miami, Florida. GPT-3.5-turbo was instructed to suggest the most appropriate tagging for each road in OpenStreetMap (OSM). The study also explores the utilization of BLIP-2, a state-of-the-art multimodal pre-training method as an artificial analyst of street-level photographs in addition to human analysts. Results demonstrate two ways to effectively increase the accuracy of mapping suggestions without modifying the underlying AI models: by (1) providing a more detailed description of source photographs, and (2) combining prompt engineering with additional context (e.g. location and objects detected along a road). The first approach increases the suggestion accuracy by up to 29%, and the second one by up to 20%.
△ Less
Submitted 15 March, 2024; v1 submitted 5 June, 2023;
originally announced June 2023.
-
SimHaze: game engine simulated data for real-world dehazing
Authors:
Zhengyang Lou,
Huan Xu,
Fangzhou Mu,
Yanli Liu,
Xiaoyu Zhang,
Liang Shang,
Jiang Li,
Bochen Guan,
Yin Li,
Yu Hen Hu
Abstract:
Deep models have demonstrated recent success in single-image dehazing. Most prior methods consider fully supervised training and learn from paired clean and hazy images, where a hazy image is synthesized based on a clean image and its estimated depth map. This paradigm, however, can produce low-quality hazy images due to inaccurate depth estimation, resulting in poor generalization of the trained…
▽ More
Deep models have demonstrated recent success in single-image dehazing. Most prior methods consider fully supervised training and learn from paired clean and hazy images, where a hazy image is synthesized based on a clean image and its estimated depth map. This paradigm, however, can produce low-quality hazy images due to inaccurate depth estimation, resulting in poor generalization of the trained models. In this paper, we explore an alternative approach for generating paired clean-hazy images by leveraging computer graphics. Using a modern game engine, our approach renders crisp clean images and their precise depth maps, based on which high-quality hazy images can be synthesized for training dehazing models. To this end, we present SimHaze: a new synthetic haze dataset. More importantly, we show that training with SimHaze alone allows the latest dehazing models to achieve significantly better performance in comparison to previous dehazing datasets. Our dataset and code will be made publicly available.
△ Less
Submitted 25 May, 2023;
originally announced May 2023.
-
Hierarchical and Contrastive Representation Learning for Knowledge-aware Recommendation
Authors:
Bingchao Wu,
Yangyuxuan Kang,
Daoguang Zan,
Bei Guan,
Yongji Wang
Abstract:
Incorporating knowledge graph into recommendation is an effective way to alleviate data sparsity. Most existing knowledge-aware methods usually perform recursive embedding propagation by enumerating graph neighbors. However, the number of nodes' neighbors grows exponentially as the hop number increases, forcing the nodes to be aware of vast neighbors under this recursive propagation for distilling…
▽ More
Incorporating knowledge graph into recommendation is an effective way to alleviate data sparsity. Most existing knowledge-aware methods usually perform recursive embedding propagation by enumerating graph neighbors. However, the number of nodes' neighbors grows exponentially as the hop number increases, forcing the nodes to be aware of vast neighbors under this recursive propagation for distilling the high-order semantic relatedness. This may induce more harmful noise than useful information into recommendation, leading the learned node representations to be indistinguishable from each other, that is, the well-known over-smoothing issue. To relieve this issue, we propose a Hierarchical and CONtrastive representation learning framework for knowledge-aware recommendation named HiCON. Specifically, for avoiding the exponential expansion of neighbors, we propose a hierarchical message aggregation mechanism to interact separately with low-order neighbors and meta-path-constrained high-order neighbors. Moreover, we also perform cross-order contrastive learning to enforce the representations to be more discriminative. Extensive experiments on three datasets show the remarkable superiority of HiCON over state-of-the-art approaches.
△ Less
Submitted 15 April, 2023;
originally announced April 2023.
-
A Review on the effectiveness of Dimensional Reduction with Computational Forensics: An Application on Malware Analysis
Authors:
Aye Thaw Da Naing,
Justin Soh Beng Guan,
Yarzar Shwe Win,
Jonathan Pan
Abstract:
The Android operating system is pervasively adopted as the operating system platform of choice for smart devices. However, the strong adoption has also resulted in exponential growth in the number of Android based malicious software or malware. To deal with such cyber threats as part of cyber investigation and digital forensics, computational techniques in the form of machine learning algorithms a…
▽ More
The Android operating system is pervasively adopted as the operating system platform of choice for smart devices. However, the strong adoption has also resulted in exponential growth in the number of Android based malicious software or malware. To deal with such cyber threats as part of cyber investigation and digital forensics, computational techniques in the form of machine learning algorithms are applied for such malware identification, detection and forensics analysis. However, such Computational Forensics modelling techniques are constrained the volume, velocity, variety and veracity of the malware landscape. This in turn would affect its identification and detection effectiveness. Such consequence would inherently induce the question of sustainability with such solution approach. One approach to optimise effectiveness is to apply dimensional reduction techniques like Principal Component Analysis with the intent to enhance algorithmic performance. In this paper, we evaluate the effectiveness of the application of Principle Component Analysis on Computational Forensics task of detecting Android based malware. We applied our research hypothesis to three different datasets with different machine learning algorithms. Our research result showed that the dimensionally reduced dataset would result in a measure of degradation in accuracy performance.
△ Less
Submitted 15 January, 2023;
originally announced January 2023.
-
Large Language Models Meet NL2Code: A Survey
Authors:
Daoguang Zan,
Bei Chen,
Fengji Zhang,
Dianjie Lu,
Bingchao Wu,
Bei Guan,
Yongji Wang,
Jian-Guang Lou
Abstract:
The task of generating code from a natural language description, or NL2Code, is considered a pressing and significant challenge in code intelligence. Thanks to the rapid development of pre-training techniques, surging large language models are being proposed for code, sparking the advances in NL2Code. To facilitate further research and applications in this field, in this paper, we present a compre…
▽ More
The task of generating code from a natural language description, or NL2Code, is considered a pressing and significant challenge in code intelligence. Thanks to the rapid development of pre-training techniques, surging large language models are being proposed for code, sparking the advances in NL2Code. To facilitate further research and applications in this field, in this paper, we present a comprehensive survey of 27 existing large language models for NL2Code, and also review benchmarks and metrics. We provide an intuitive comparison of all existing models on the HumanEval benchmark. Through in-depth observation and analysis, we provide some insights and conclude that the key factors contributing to the success of large language models for NL2Code are "Large Size, Premium Data, Expert Tuning". In addition, we discuss challenges and opportunities regarding the gap between models and humans. We also create a website https://nl2code.github.io to track the latest progress through crowd-sourcing. To the best of our knowledge, this is the first survey of large language models for NL2Code, and we believe it will contribute to the ongoing development of the field.
△ Less
Submitted 8 May, 2023; v1 submitted 19 December, 2022;
originally announced December 2022.
-
Generative Modeling in Sinogram Domain for Sparse-view CT Reconstruction
Authors:
Bing Guan,
Cailian Yang,
Liu Zhang,
Shanzhou Niu,
Minghui Zhang,
Yuhao Wang,
Weiwen Wu,
Qiegen Liu
Abstract:
The radiation dose in computed tomography (CT) examinations is harmful for patients but can be significantly reduced by intuitively decreasing the number of projection views. Reducing projection views usually leads to severe aliasing artifacts in reconstructed images. Previous deep learning (DL) techniques with sparse-view data require sparse-view/full-view CT image pairs to train the network with…
▽ More
The radiation dose in computed tomography (CT) examinations is harmful for patients but can be significantly reduced by intuitively decreasing the number of projection views. Reducing projection views usually leads to severe aliasing artifacts in reconstructed images. Previous deep learning (DL) techniques with sparse-view data require sparse-view/full-view CT image pairs to train the network with supervised manners. When the number of projection view changes, the DL network should be retrained with updated sparse-view/full-view CT image pairs. To relieve this limitation, we present a fully unsupervised score-based generative model in sinogram domain for sparse-view CT reconstruction. Specifically, we first train a score-based generative model on full-view sinogram data and use multi-channel strategy to form highdimensional tensor as the network input to capture their prior distribution. Then, at the inference stage, the stochastic differential equation (SDE) solver and data-consistency step were performed iteratively to achieve fullview projection. Filtered back-projection (FBP) algorithm was used to achieve the final image reconstruction. Qualitative and quantitative studies were implemented to evaluate the presented method with several CT data. Experimental results demonstrated that our method achieved comparable or better performance than the supervised learning counterparts.
△ Less
Submitted 25 November, 2022;
originally announced November 2022.
-
When Language Model Meets Private Library
Authors:
Daoguang Zan,
Bei Chen,
Zeqi Lin,
Bei Guan,
Yongji Wang,
Jian-Guang Lou
Abstract:
With the rapid development of pre-training techniques, a number of language models have been pre-trained on large-scale code corpora and perform well in code generation. In this paper, we investigate how to equip pre-trained language models with the ability of code generation for private libraries. In practice, it is common for programmers to write code using private libraries. However, this is a…
▽ More
With the rapid development of pre-training techniques, a number of language models have been pre-trained on large-scale code corpora and perform well in code generation. In this paper, we investigate how to equip pre-trained language models with the ability of code generation for private libraries. In practice, it is common for programmers to write code using private libraries. However, this is a challenge for language models since they have never seen private APIs during training. Motivated by the fact that private libraries usually come with elaborate API documentation, we propose a novel framework with two modules: the APIRetriever finds useful APIs, and then the APICoder generates code using these APIs. For APIRetriever, we present a dense retrieval system and also design a friendly interaction to involve uses. For APICoder, we can directly use off-the-shelf language models, or continually pre-train the base model on a code corpus containing API information. Both modules are trained with data from public libraries and can be generalized to private ones. Furthermore, we craft three benchmarks for private libraries, named TorchDataEval, MonkeyEval, and BeatNumEval. Experimental results demonstrate the impressive performance of our framework.
△ Less
Submitted 31 October, 2022;
originally announced October 2022.
-
Semi-supervised object detection based on single-stage detector for thighbone fracture localization
Authors:
Jinman Wei,
Jinkun Yao,
Guoshan Zhanga,
Bin Guan,
Yueming Zhang,
Shaoquan Wang
Abstract:
The thighbone is the largest bone supporting the lower body. If the thighbone fracture is not treated in time, it will lead to lifelong inability to walk. Correct diagnosis of thighbone disease is very important in orthopedic medicine. Deep learning is promoting the development of fracture detection technology. However, the existing computer aided diagnosis (CAD) methods baesd on deep learning rel…
▽ More
The thighbone is the largest bone supporting the lower body. If the thighbone fracture is not treated in time, it will lead to lifelong inability to walk. Correct diagnosis of thighbone disease is very important in orthopedic medicine. Deep learning is promoting the development of fracture detection technology. However, the existing computer aided diagnosis (CAD) methods baesd on deep learning rely on a large number of manually labeled data, and labeling these data costs a lot of time and energy. Therefore, we develop a object detection method with limited labeled image quantity and apply it to the thighbone fracture localization. In this work, we build a semi-supervised object detection(SSOD) framework based on single-stage detector, which including three modules: adaptive difficult sample oriented (ADSO) module, Fusion Box and deformable expand encoder (Dex encoder). ADSO module takes the classification score as the label reliability evaluation criterion by weighting, Fusion Box is designed to merge similar pseudo boxes into a reliable box for box regression and Dex encoder is proposed to enhance the adaptability of image augmentation. The experiment is conducted on the thighbone fracture dataset, which includes 3484 training thigh fracture images and 358 testing thigh fracture images. The experimental results show that the proposed method achieves the state-of-the-art AP in thighbone fracture detection at different labeled data rates, i.e. 1%, 5% and 10%. Besides, we use full data to achieve knowledge distillation, our method achieves 86.2% AP50 and 52.6% AP75.
△ Less
Submitted 19 October, 2022;
originally announced October 2022.
-
CERT: Continual Pre-Training on Sketches for Library-Oriented Code Generation
Authors:
Daoguang Zan,
Bei Chen,
Dejian Yang,
Zeqi Lin,
Minsu Kim,
Bei Guan,
Yongji Wang,
Weizhu Chen,
Jian-Guang Lou
Abstract:
Code generation is a longstanding challenge, aiming to generate a code snippet based on a natural language description. Usually, expensive text-code paired data is essential for training a code generation model. Recently, thanks to the success of pre-training techniques, large language models are trained on large-scale unlabelled code corpora and perform well in code generation. In this paper, we…
▽ More
Code generation is a longstanding challenge, aiming to generate a code snippet based on a natural language description. Usually, expensive text-code paired data is essential for training a code generation model. Recently, thanks to the success of pre-training techniques, large language models are trained on large-scale unlabelled code corpora and perform well in code generation. In this paper, we investigate how to leverage an unlabelled code corpus to train a model for library-oriented code generation. Since it is a common practice for programmers to reuse third-party libraries, in which case the text-code paired data are harder to obtain due to the huge number of libraries. We observe that library-oriented code snippets are more likely to share similar code sketches. Hence, we present CERT with two steps: a sketcher generates the sketch, then a generator fills the details in the sketch. Both the sketcher and the generator are continually pre-trained upon a base model using unlabelled data. Furthermore, we craft two benchmarks named PandasEval and NumpyEval to evaluate library-oriented code generation. Experimental results demonstrate the impressive performance of CERT. For example, it surpasses the base model by an absolute 15.67% improvement in terms of pass@1 on PandasEval. Our work is available at https://github.com/microsoft/PyCodeGPT.
△ Less
Submitted 14 June, 2022;
originally announced June 2022.
-
Universal Generative Modeling for Calibration-free Parallel Mr Imaging
Authors:
Wanqing Zhu,
Bing Guan,
Shanshan Wang,
Minghui Zhang,
Qiegen Liu
Abstract:
The integration of compressed sensing and parallel imaging (CS-PI) provides a robust mechanism for accelerating MRI acquisitions. However, most such strategies require the explicit formation of either coil sensitivity profiles or a cross-coil correlation operator, and as a result reconstruction corresponds to solving a challenging bilinear optimization problem. In this work, we present an unsuperv…
▽ More
The integration of compressed sensing and parallel imaging (CS-PI) provides a robust mechanism for accelerating MRI acquisitions. However, most such strategies require the explicit formation of either coil sensitivity profiles or a cross-coil correlation operator, and as a result reconstruction corresponds to solving a challenging bilinear optimization problem. In this work, we present an unsupervised deep learning framework for calibration-free parallel MRI, coined universal generative modeling for parallel imaging (UGM-PI). More precisely, we make use of the merits of both wavelet transform and the adaptive iteration strategy in a unified framework. We train a powerful noise conditional score network by forming wavelet tensor as the network input at the training phase. Experimental results on both physical phantom and in vivo datasets implied that the proposed method is comparable and even superior to state-of-the-art CS-PI reconstruction approaches.
△ Less
Submitted 25 January, 2022;
originally announced January 2022.
-
SMOF: Squeezing More Out of Filters Yields Hardware-Friendly CNN Pruning
Authors:
Yanli Liu,
Bochen Guan,
Qinwen Xu,
Weiyi Li,
Shuxue Quan
Abstract:
For many years, the family of convolutional neural networks (CNNs) has been a workhorse in deep learning. Recently, many novel CNN structures have been designed to address increasingly challenging tasks. To make them work efficiently on edge devices, researchers have proposed various structured network pruning strategies to reduce their memory and computational cost. However, most of them only foc…
▽ More
For many years, the family of convolutional neural networks (CNNs) has been a workhorse in deep learning. Recently, many novel CNN structures have been designed to address increasingly challenging tasks. To make them work efficiently on edge devices, researchers have proposed various structured network pruning strategies to reduce their memory and computational cost. However, most of them only focus on reducing the number of filter channels per layer without considering the redundancy within individual filter channels. In this work, we explore pruning from another dimension, the kernel size. We develop a CNN pruning framework called SMOF, which Squeezes More Out of Filters by reducing both kernel size and the number of filter channels. Notably, SMOF is friendly to standard hardware devices without any customized low-level implementations, and the pruning effort by kernel size reduction does not suffer from the fixed-size width constraint in SIMD units of general-purpose processors. The pruned networks can be deployed effortlessly with significant running time reduction. We also support these claims via extensive experiments on various CNN structures and general-purpose processors for mobile devices.
△ Less
Submitted 20 October, 2021;
originally announced October 2021.
-
Superstring-Based Sequence Obfuscation to Thwart Pattern Matching Attacks
Authors:
Bo Guan,
Nazanin Takbiri,
Dennis Goeckel,
Amir Houmansadr,
Hossein Pishro-Nik
Abstract:
User privacy can be compromised by matching user data traces to records of their previous behavior. The matching of the statistical characteristics of traces to prior user behavior has been widely studied. However, an adversary can also identify a user deterministically by searching data traces for a pattern that is unique to that user. Our goal is to thwart such an adversary by applying small art…
▽ More
User privacy can be compromised by matching user data traces to records of their previous behavior. The matching of the statistical characteristics of traces to prior user behavior has been widely studied. However, an adversary can also identify a user deterministically by searching data traces for a pattern that is unique to that user. Our goal is to thwart such an adversary by applying small artificial distortions to data traces such that each potentially identifying pattern is shared by a large number of users. Importantly, in contrast to statistical approaches, we develop data-independent algorithms that require no assumptions on the model by which the traces are generated. By relating the problem to a set of combinatorial questions on sequence construction, we are able to provide provable guarantees for our proposed constructions. We also introduce data-dependent approaches for the same problem. The algorithms are evaluated on synthetic data traces and on the Reality Mining Dataset to demonstrate their utility.
△ Less
Submitted 27 August, 2021;
originally announced August 2021.
-
Fed-EINI: An Efficient and Interpretable Inference Framework for Decision Tree Ensembles in Federated Learning
Authors:
Xiaolin Chen,
Shuai Zhou,
Bei guan,
Kai Yang,
Hao Fan,
Hu Wang,
Yongji Wang
Abstract:
The increasing concerns about data privacy and security drive an emerging field of studying privacy-preserving machine learning from isolated data sources, i.e., federated learning. A class of federated learning, vertical federated learning, where different parties hold different features for common users, has a great potential of driving a great variety of business cooperation among enterprises i…
▽ More
The increasing concerns about data privacy and security drive an emerging field of studying privacy-preserving machine learning from isolated data sources, i.e., federated learning. A class of federated learning, vertical federated learning, where different parties hold different features for common users, has a great potential of driving a great variety of business cooperation among enterprises in many fields. In machine learning, decision tree ensembles such as gradient boosting decision trees (GBDT) and random forest are widely applied powerful models with high interpretability and modeling efficiency. However, stateof-art vertical federated learning frameworks adapt anonymous features to avoid possible data breaches, makes the interpretability of the model compromised. To address this issue in the inference process, in this paper, we firstly make a problem analysis about the necessity of disclosure meanings of feature to Guest Party in vertical federated learning. Then we find the prediction result of a tree could be expressed as the intersection of results of sub-models of the tree held by all parties. With this key observation, we protect data privacy and allow the disclosure of feature meaning by concealing decision paths and adapt a communication-efficient secure computation method for inference outputs. The advantages of Fed-EINI will be demonstrated through both theoretical analysis and extensive numerical results. We improve the interpretability of the model by disclosing the meaning of features while ensuring efficiency and accuracy.
△ Less
Submitted 7 December, 2021; v1 submitted 20 May, 2021;
originally announced May 2021.
-
On Relative Pose Recovery for Multi-Camera Systems
Authors:
Ji Zhao,
Banglei Guan
Abstract:
The point correspondence (PC) and affine correspondence (AC) are widely used for relative pose estimation. An AC consists of a PC across two views and an affine transformation between the small patches around this PC. Previous work demonstrates that one AC generally provides three independent constraints for relative pose estimation. For multi-camera systems, there is still not any AC-based minima…
▽ More
The point correspondence (PC) and affine correspondence (AC) are widely used for relative pose estimation. An AC consists of a PC across two views and an affine transformation between the small patches around this PC. Previous work demonstrates that one AC generally provides three independent constraints for relative pose estimation. For multi-camera systems, there is still not any AC-based minimal solver for general relative pose estimation. To deal with this problem, we propose a complete solution to relative pose estimation from two ACs for multi-camera systems, consisting of a series of minimal solvers. The solver generation in our solution is based on Cayley or quaternion parameterization for rotation and hidden variable technique to eliminate translation. This solver generation method is also naturally applied to relative pose estimation from PCs, resulting in a new six-point method for multi-camera systems. A few extensions are made, including relative pose estimation with known rotation angle and/or with unknown focal lengths. Extensive experiments demonstrate that the proposed AC-based solvers and PC-based solvers are effective and efficient on synthetic and real-world datasets.
△ Less
Submitted 23 February, 2021;
originally announced February 2021.
-
Minimal Cases for Computing the Generalized Relative Pose using Affine Correspondences
Authors:
Banglei Guan,
Ji Zhao,
Daniel Barath,
Friedrich Fraundorfer
Abstract:
We propose three novel solvers for estimating the relative pose of a multi-camera system from affine correspondences (ACs). A new constraint is derived interpreting the relationship of ACs and the generalized camera model. Using the constraint, we demonstrate efficient solvers for two types of motions assumed. Considering that the cameras undergo planar motion, we propose a minimal solution using…
▽ More
We propose three novel solvers for estimating the relative pose of a multi-camera system from affine correspondences (ACs). A new constraint is derived interpreting the relationship of ACs and the generalized camera model. Using the constraint, we demonstrate efficient solvers for two types of motions assumed. Considering that the cameras undergo planar motion, we propose a minimal solution using a single AC and a solver with two ACs to overcome the degenerate case. Also, we propose a minimal solution using two ACs with known vertical direction, e.g., from an IMU. Since the proposed methods require significantly fewer correspondences than state-of-the-art algorithms, they can be efficiently used within RANSAC for outlier removal and initial motion estimation. The solvers are tested both on synthetic data and on real-world scenes from the KITTI odometry benchmark. It is shown that the accuracy of the estimated poses is superior to the state-of-the-art techniques.
△ Less
Submitted 19 August, 2021; v1 submitted 21 July, 2020;
originally announced July 2020.
-
Minimal Solutions for Relative Pose with a Single Affine Correspondence
Authors:
Banglei Guan,
Ji Zhao,
Zhang Li,
Fang Sun,
Friedrich Fraundorfer
Abstract:
In this paper we present four cases of minimal solutions for two-view relative pose estimation by exploiting the affine transformation between feature points and we demonstrate efficient solvers for these cases. It is shown, that under the planar motion assumption or with knowledge of a vertical direction, a single affine correspondence is sufficient to recover the relative camera pose. The four c…
▽ More
In this paper we present four cases of minimal solutions for two-view relative pose estimation by exploiting the affine transformation between feature points and we demonstrate efficient solvers for these cases. It is shown, that under the planar motion assumption or with knowledge of a vertical direction, a single affine correspondence is sufficient to recover the relative camera pose. The four cases considered are two-view planar relative motion for calibrated cameras as a closed-form and a least-squares solution, a closed-form solution for unknown focal length and the case of a known vertical direction. These algorithms can be used efficiently for outlier detection within a RANSAC loop and for initial motion estimation. All the methods are evaluated on both synthetic data and real-world datasets from the KITTI benchmark. The experimental results demonstrate that our methods outperform comparable state-of-the-art methods in accuracy with the benefit of a reduced number of needed RANSAC iterations.
△ Less
Submitted 3 April, 2020; v1 submitted 23 December, 2019;
originally announced December 2019.
-
SpecNet: Spectral Domain Convolutional Neural Network
Authors:
Bochen Guan,
Jinnian Zhang,
William A. Sethares,
Richard Kijowski,
Fang Liu
Abstract:
The memory consumption of most Convolutional Neural Network (CNN) architectures grows rapidly with increasing depth of the network, which is a major constraint for efficient network training on modern GPUs with limited memory, embedded systems, and mobile devices. Several studies show that the feature maps (as generated after the convolutional layers) are the main bottleneck in this memory problem…
▽ More
The memory consumption of most Convolutional Neural Network (CNN) architectures grows rapidly with increasing depth of the network, which is a major constraint for efficient network training on modern GPUs with limited memory, embedded systems, and mobile devices. Several studies show that the feature maps (as generated after the convolutional layers) are the main bottleneck in this memory problem. Often, these feature maps mimic natural photographs in the sense that their energy is concentrated in the spectral domain. Although embedding CNN architectures in the spectral domain is widely exploited to accelerate the training process, we demonstrate that it is also possible to use the spectral domain to reduce the memory footprint, a method we call Spectral Domain Convolutional Neural Network (SpecNet) that performs both the convolution and the activation operations in the spectral domain. The performance of SpecNet is evaluated on three competitive object recognition benchmark tasks (CIFAR-10, SVHN, and ImageNet), and compared with several state-of-the-art implementations. Overall, SpecNet is able to reduce memory consumption by about 60% without significant loss of performance for all tested networks.
△ Less
Submitted 8 February, 2021; v1 submitted 26 May, 2019;
originally announced May 2019.
-
Video Logo Retrieval based on local Features
Authors:
Bochen Guan,
Hanrong Ye,
Hong Liu,
William A. Sethares
Abstract:
Estimation of the frequency and duration of logos in videos is important and challenging in the advertisement industry as a way of estimating the impact of ad purchases. Since logos occupy only a small area in the videos, the popular methods of image retrieval could fail. This paper develops an algorithm called Video Logo Retrieval (VLR), which is an image-to-video retrieval algorithm based on the…
▽ More
Estimation of the frequency and duration of logos in videos is important and challenging in the advertisement industry as a way of estimating the impact of ad purchases. Since logos occupy only a small area in the videos, the popular methods of image retrieval could fail. This paper develops an algorithm called Video Logo Retrieval (VLR), which is an image-to-video retrieval algorithm based on the spatial distribution of local image descriptors that measure the distance between the query image (the logo) and a collection of video images. VLR uses local features to overcome the weakness of global feature-based models such as convolutional neural networks (CNN). Meanwhile, VLR is flexible and does not require training after setting some hyper-parameters. The performance of VLR is evaluated on two challenging open benchmark tasks (SoccerNet and Standford I2V), and compared with other state-of-the-art logo retrieval or detection algorithms. Overall, VLR shows significantly higher accuracy compared with the existing methods.
△ Less
Submitted 18 May, 2020; v1 submitted 10 August, 2018;
originally announced August 2018.
-
Killing Two Birds with One Stone: Malicious Domain Detection with High Accuracy and Coverage
Authors:
Issa Khalil,
Bei Guan,
Mohamed Nabeel,
Ting Yu
Abstract:
Inference based techniques are one of the major approaches to analyze DNS data and detecting malicious domains. The key idea of inference techniques is to first define associations between domains based on features extracted from DNS data. Then, an inference algorithm is deployed to infer potential malicious domains based on their direct/indirect associations with known malicious ones. The way ass…
▽ More
Inference based techniques are one of the major approaches to analyze DNS data and detecting malicious domains. The key idea of inference techniques is to first define associations between domains based on features extracted from DNS data. Then, an inference algorithm is deployed to infer potential malicious domains based on their direct/indirect associations with known malicious ones. The way associations are defined is key to the effectiveness of an inference technique. It is desirable to be both accurate (i.e., avoid falsely associating domains with no meaningful connections) and with good coverage (i.e., identify all associations between domains with meaningful connections). Due to the limited scope of information provided by DNS data, it becomes a challenge to design an association scheme that achieves both high accuracy and good coverage.
In this paper, we propose a new association scheme to identify domains controlled by the same entity. Our key idea is an in-depth analysis of active DNS data to accurately separate public IPs from dedicated ones, which enables us to build high-quality associations between domains. Our scheme identifies many meaningful connections between domains that are discarded by existing state-of-the-art approaches. Our experimental results show that the proposed association scheme not only significantly improves the domain coverage compared to existing approaches but also achieves better detection accuracy.
Existing path-based inference algorithm is specifically designed for DNS data analysis. It is effective but computationally expensive. As a solution, we investigate the effectiveness of combining our association scheme with the generic belief propagation algorithm. Through comprehensive experiments, we show that this approach offers significant efficiency and scalability improvement with only minor negative impact of detection accuracy.
△ Less
Submitted 1 November, 2017;
originally announced November 2017.
-
A fast eikonal equation solver using the Schrodinger wave equation
Authors:
Karthik S. Gurumoorthy,
Adrian M. Peter,
Birmingham Hang Guan,
Anand Rangarajan
Abstract:
We use a Schrödinger wave equation formalism to solve the eikonal equation. In our framework, a solution to the eikonal equation is obtained in the limit as Planck's constant $\hbar$ (treated as a free parameter) tends to zero of the solution to the corresponding linear Schrödinger equation. The Schrödinger equation corresponding to the eikonal turns out to be a \emph{generalized, screened Poisson…
▽ More
We use a Schrödinger wave equation formalism to solve the eikonal equation. In our framework, a solution to the eikonal equation is obtained in the limit as Planck's constant $\hbar$ (treated as a free parameter) tends to zero of the solution to the corresponding linear Schrödinger equation. The Schrödinger equation corresponding to the eikonal turns out to be a \emph{generalized, screened Poisson equation}. Despite being linear, it does not have a closed-form solution for arbitrary forcing functions. We present two different techniques to solve the screened Poisson equation. In the first approach we use a standard perturbation analysis approach to derive a new algorithm which is guaranteed to converge provided the forcing function is bounded and positive. The perturbation technique requires a sequence of discrete convolutions which can be performed in $O(N\log N)$ using the Fast Fourier Transform (FFT) where $N$ is the number of grid points. In the second method we discretize the linear Laplacian operator by the finite difference method leading to a sparse linear system of equations which can be solved using the plethora of sparse solvers. The eikonal solution is recovered from the exponent of the resultant scalar field. Our approach eliminates the need to explicitly construct viscosity solutions as customary with direct solutions to the eikonal. Since the linear equation is computed for a small but non-zero $\hbar$, the obtained solution is an approximation. Though our solution framework is applicable to the general class of eikonal problems, we detail specifics for the popular vision applications of shape-from-shading, vessel segmentation, and path planning.
△ Less
Submitted 8 February, 2015; v1 submitted 8 March, 2014;
originally announced March 2014.