-
A Review of Image Processing Methods in Prostate Ultrasound
Authors:
Haiqiao Wang,
Hong Wu,
Zhuoyuan Wang,
Peiyan Yue,
Dong Ni,
Pheng-Ann Heng,
Yi Wang
Abstract:
Prostate cancer (PCa) poses a significant threat to men's health, with early diagnosis being crucial for improving prognosis and reducing mortality rates. Transrectal ultrasound (TRUS) plays a vital role in the diagnosis and image-guided intervention of PCa. To support physicians with more accurate and efficient computer-assisted diagnosis and interventions, many image processing algorithms for TRUS have been proposed and have achieved state-of-the-art performance in several tasks, including prostate gland segmentation, prostate image registration, PCa classification and detection, and interventional needle detection. The rapid development of these algorithms over the past two decades necessitates a comprehensive summary. Consequently, this survey provides a systematic analysis of this field, outlining the evolution of image processing methods in the context of TRUS image analysis and highlighting their relevant contributions. Furthermore, this survey discusses current challenges and suggests future research directions to advance this field further.
Submitted 30 June, 2024;
originally announced July 2024.
-
Research on Foundation Model for Spatial Data Intelligence: China's 2024 White Paper on Strategic Development of Spatial Data Intelligence
Authors:
Shaohua Wang,
Xing Xie,
Yong Li,
Danhuai Guo,
Zhi Cai,
Yu Liu,
Yang Yue,
Xiao Pan,
Feng Lu,
Huayi Wu,
Zhipeng Gui,
Zhiming Ding,
Bolong Zheng,
Fuzheng Zhang,
Tao Qin,
Jingyuan Wang,
Chuang Tao,
Zhengchao Chen,
Hao Lu,
Jiayi Li,
Hongyang Chen,
Peng Yue,
Wenhao Yu,
Yao Yao,
Leilei Sun, et al. (9 additional authors not shown)
Abstract:
This report focuses on spatial data intelligent large models, delving into the principles, methods, and cutting-edge applications of these models. It provides an in-depth discussion on the definition, development history, current status, and trends of spatial data intelligent large models, as well as the challenges they face. The report systematically elucidates the key technologies of spatial data intelligent large models and their applications in urban environments, aerospace remote sensing, geography, transportation, and other scenarios. Additionally, it summarizes the latest application cases of spatial data intelligent large models in themes such as urban development, multimodal systems, remote sensing, smart transportation, and resource environments. Finally, the report concludes with an overview and outlook on the development prospects of spatial data intelligent large models.
Submitted 29 June, 2024; v1 submitted 30 May, 2024;
originally announced May 2024.
-
Reinforcement Learning for Scalable Train Timetable Rescheduling with Graph Representation
Authors:
Peng Yue,
Yaochu Jin,
Xuewu Dai,
Zhenhua Feng,
Dongliang Cui
Abstract:
Train timetable rescheduling (TTR) aims to promptly restore the original operation of trains after unexpected disturbances or disruptions. Currently, this work is still done manually by train dispatchers, which makes it challenging to maintain performance across varied problem instances. To mitigate this issue, this study proposes a reinforcement learning-based approach to TTR, which makes the following contributions compared to existing work. First, we design a simple directed graph to represent the TTR problem, enabling the automatic extraction of informative states through graph neural networks. Second, we reformulate the construction process of the TTR solution, not only decoupling the decision model from the problem size but also ensuring the feasibility of the generated scheme. Third, we design a learning curriculum for our model to handle scenarios with different levels of delay. Finally, a simple local search method is proposed to assist the learned decision model, which can significantly improve solution quality with little additional computation cost, further enhancing the practical value of our method. Extensive experimental results demonstrate the effectiveness of our method. The learned decision model can achieve better performance for various problems with varying degrees of train delay and different scales when compared to handcrafted rules and state-of-the-art solvers.
Submitted 12 January, 2024;
originally announced January 2024.
-
Accelerated Gradient Algorithms with Adaptive Subspace Search for Instance-Faster Optimization
Authors:
Yuanshi Liu,
Hanzhen Zhao,
Yang Xu,
Pengyun Yue,
Cong Fang
Abstract:
Gradient-based minimax optimal algorithms have greatly promoted the development of continuous optimization and machine learning. One seminal work due to Yurii Nesterov [Nes83a] established $\tilde{\mathcal{O}}(\sqrt{L/\mu})$ gradient complexity for minimizing an $L$-smooth $\mu$-strongly convex objective. However, an ideal algorithm would adapt to the explicit complexity of a particular objective function and incur faster rates for simpler problems, triggering our reconsideration of two defects of existing optimization modeling and analysis. (i) The worst-case optimality is neither the instance optimality nor such one in reality. (ii) The traditional $L$-smoothness condition may not be the primary abstraction/characterization for modern practical problems.
In this paper, we open up a new way to design and analyze gradient-based algorithms with direct applications in machine learning, including linear regression and beyond. We introduce two factors $(\alpha, \tau_\alpha)$ to refine the description of the degenerated condition of the optimization problems based on the observation that the singular values of the Hessian often drop sharply. We design adaptive algorithms that solve simpler problems without prior knowledge and with reduced gradient or analogous oracle accesses. The algorithms also improve the state-of-the-art complexities for several problems in machine learning, thereby solving the open problem of how to design faster algorithms in light of the known complexity lower bounds. Specifically, with the $\mathcal{O}(1)$-nuclear norm bounded, we achieve an optimal $\tilde{\mathcal{O}}(\mu^{-1/3})$ (vs. $\tilde{\mathcal{O}}(\mu^{-1/2})$) gradient complexity for linear regression. We hope this work could prompt a rethinking of the difficulty of modern problems in optimization.
Submitted 5 December, 2023;
originally announced December 2023.
-
CORE: Common Random Reconstruction for Distributed Optimization with Provable Low Communication Complexity
Authors:
Pengyun Yue,
Hanzhen Zhao,
Cong Fang,
Di He,
Liwei Wang,
Zhouchen Lin,
Song-chun Zhu
Abstract:
With distributed machine learning being a prominent technique for large-scale machine learning tasks, communication complexity has become a major bottleneck for speeding up training and scaling up machine numbers. In this paper, we propose a new technique named Common randOm REconstruction (CORE), which can be used to compress the information transmitted between machines in order to reduce communication complexity without other strict conditions. Specifically, CORE projects the vector-valued information to a low-dimensional one through common random vectors and reconstructs the information with the same random noises after communication. We apply CORE to two distributed tasks, namely convex optimization on linear models and generic non-convex optimization, and design new distributed algorithms, which achieve provably lower communication complexities. For example, we show that for linear models a CORE-based algorithm can encode the gradient vector to $\mathcal{O}(1)$ bits (against $\mathcal{O}(d)$), with a convergence rate that is no worse, improving on the existing results.
Submitted 23 September, 2023;
originally announced September 2023.
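The projection-and-reconstruction idea described in the CORE abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm: the Gaussian random vectors, the shared integer seed, the `1/m` scaling, and the function names `common_random_vectors`, `core_encode`, and `core_decode` are all assumptions made for illustration. The sketch sends `m` floating-point scalars, so it does not reproduce the $\mathcal{O}(1)$-bit encoding claimed in the abstract.

```python
import numpy as np


def common_random_vectors(seed, m, d):
    """Draw m standard Gaussian vectors in R^d from a shared seed (assumed setup)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((m, d))


def core_encode(g, seed, m):
    """Sender side: project the d-dimensional vector g onto m common random vectors."""
    xi = common_random_vectors(seed, m, g.shape[0])
    return xi @ g  # m scalars are transmitted instead of d coordinates


def core_decode(p, seed, d):
    """Receiver side: regenerate the same random vectors and reconstruct g.

    Because E[xi xi^T] = I for standard Gaussian xi, the average of
    p_i * xi_i is an unbiased estimate of the original vector.
    """
    xi = common_random_vectors(seed, p.shape[0], d)
    return xi.T @ p / p.shape[0]
```

Both sides call `common_random_vectors` with the same seed, so only the projections travel over the network. The estimate is unbiased but noisy, and its variance shrinks as `m` grows; how the paper controls that variance is not stated in the abstract.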
-
Cross-Dimensional Refined Learning for Real-Time 3D Visual Perception from Monocular Video
Authors:
Ziyang Hong,
C. Patrick Yue
Abstract:
We present a novel real-time capable learning method that jointly perceives a 3D scene's geometric structure and semantic labels. Recent approaches to real-time 3D scene reconstruction mostly adopt a volumetric scheme, where a Truncated Signed Distance Function (TSDF) is directly regressed. However, these volumetric approaches tend to focus on the global coherence of their reconstructions, which leads to a lack of local geometric detail. To overcome this issue, we propose to leverage the latent geometric prior knowledge in 2D image features by explicit depth prediction and anchored feature generation, to refine the occupancy learning in the TSDF volume. Besides, we find that this cross-dimensional feature refinement methodology can also be adopted for the semantic segmentation task by utilizing semantic priors. Hence, we propose an end-to-end cross-dimensional refinement neural network (CDRNet) to extract both 3D mesh and 3D semantic labeling in real time. Experimental results show that this method achieves state-of-the-art 3D perception efficiency on multiple datasets, which indicates the great potential of our method for industrial applications.
Submitted 10 September, 2023; v1 submitted 16 March, 2023;
originally announced March 2023.
-
Improving Speech Emotion Recognition with Unsupervised Speaking Style Transfer
Authors:
Leyuan Qu,
Wei Wang,
Cornelius Weber,
Pengcheng Yue,
Taihao Li,
Stefan Wermter
Abstract:
Humans can effortlessly modify various prosodic attributes, such as the placement of stress and the intensity of sentiment, to convey a specific emotion while maintaining consistent linguistic content. Motivated by this capability, we propose EmoAug, a novel style transfer model designed to enhance emotional expression and tackle the data scarcity issue in speech emotion recognition tasks. EmoAug consists of a semantic encoder and a paralinguistic encoder that represent verbal and non-verbal information respectively. Additionally, a decoder reconstructs speech signals by conditioning on the aforementioned two information flows in an unsupervised fashion. Once training is completed, EmoAug enriches expressions of emotional speech with different prosodic attributes, such as stress, rhythm and intensity, by feeding different styles into the paralinguistic encoder. EmoAug enables us to generate similar numbers of samples for each class to tackle the data imbalance issue as well. Experimental results on the IEMOCAP dataset demonstrate that EmoAug can successfully transfer different speaking styles while retaining the speaker identity and semantic content. Furthermore, we train a SER model with data augmented by EmoAug and show that the augmented model not only surpasses the state-of-the-art supervised and self-supervised methods but also overcomes overfitting problems caused by data imbalance. Some audio samples can be found on our demo website.
Submitted 28 December, 2023; v1 submitted 16 November, 2022;
originally announced November 2022.
-
Technology Report: Robotic Localization and Navigation System for Visible Light Positioning and SLAM
Authors:
Weipeng Guan,
Patrick Yue
Abstract:
Visible light positioning (VLP) is a promising technique as it can provide high-accuracy positioning based on the existing lighting infrastructure. However, existing approaches often require dense lighting distributions. Additionally, due to complicated indoor environments, it is still challenging to develop a robust VLP system. In this work, we propose a loosely coupled multi-sensor fusion method based on VLP and Simultaneous Localization and Mapping (SLAM), with light detection and ranging (LiDAR), odometry, and a rolling shutter camera. Our method can provide accurate and robust robot localization and navigation in LED-shortage or even outage situations. The efficacy of the proposed scheme is verified by extensive real-time experiments. The results show that our proposed scheme can provide an average accuracy of 2 cm, and the average computational time on low-cost embedded platforms is around 50 ms.
Submitted 2 September, 2021; v1 submitted 30 April, 2021;
originally announced April 2021.
-
Efficient Training Convolutional Neural Networks on Edge Devices with Gradient-pruned Sign-symmetric Feedback Alignment
Authors:
Ziyang Hong,
C. Patrick Yue
Abstract:
With the proliferation of mobile devices, the distributed learning approach enabling model training with decentralized data has attracted wide research interest. However, the lack of training capability on edge devices significantly limits the energy efficiency of distributed learning in real life. This paper describes a novel approach to training DNNs that exploits the redundancy and the weight asymmetry potential of conventional backpropagation. We demonstrate that, with negligible classification accuracy loss, the proposed approach outperforms the prior art by 5x in terms of energy efficiency.
Submitted 12 May, 2021; v1 submitted 4 March, 2021;
originally announced March 2021.
-
Heterogeneous Swarms for Maritime Dynamic Target Search and Tracking
Authors:
Hian Lee Kwa,
Grgur Tokić,
Roland Bouffanais,
Dick K. P. Yue
Abstract:
Current strategies employed for maritime target search and tracking are primarily based on the use of agents following a predetermined path to perform a systematic sweep of a search area. Recently, dynamic Particle Swarm Optimization (PSO) algorithms have been used together with swarming multi-robot systems (MRS), giving search and tracking solutions the added properties of robustness, scalability, and flexibility. Swarming MRS also give the end-user the opportunity to incrementally upgrade the robotic system, inevitably leading to the use of heterogeneous swarming MRS. However, such systems have not been well studied and incorporating upgraded agents into a swarm may result in degraded mission performances. In this paper, we propose a PSO-based strategy using a topological k-nearest neighbor graph with tunable exploration and exploitation dynamics with an adaptive repulsion parameter. This strategy is implemented within a simulated swarm of 50 agents with varying proportions of fast agents tracking a target represented by a fictitious binary function. Through these simulations, we are able to demonstrate an increase in the swarm's collective response level and target tracking performance by substituting in a proportion of fast buoys.
Submitted 3 August, 2020;
originally announced August 2020.
-
Gradual Collective Upgrade of a Swarm of Autonomous Buoys for Dynamic Ocean Monitoring
Authors:
Francesco Vallegra,
David Mateo,
Grgur Tokić,
Roland Bouffanais,
Dick K. P. Yue
Abstract:
Swarms of autonomous surface vehicles equipped with environmental sensors and decentralized communications bring a new wave of attractive possibilities for the monitoring of dynamic features in oceans and other waterbodies. However, a key challenge in swarm robotics design is the efficient collective operation of heterogeneous systems. We present both theoretical analysis and field experiments on the responsiveness in dynamic area coverage of a collective of 22 autonomous buoys, where 4 units are upgraded to a new design that allows them to move 80% faster than the rest. This system is able to react on timescales of about a minute to changes in areas on the order of a few thousand square meters. We have observed that this partial upgrade of the system significantly increases its average responsiveness, without necessarily improving the spatial uniformity of the deployment. These experiments show that the autonomous buoy designs and the cooperative control rule described in this work provide an efficient, flexible, and scalable solution for the pervasive and persistent monitoring of water environments.
Submitted 31 August, 2018;
originally announced August 2018.
-
Swarm-Enabling Technology for Multi-Robot Systems
Authors:
Mohammadreza Chamanbaz,
David Mateo,
Brandon M. Zoss,
Grgur Tokić,
Erik Wilhelm,
Roland Bouffanais,
Dick K. P. Yue
Abstract:
Swarm robotics has experienced a rapid expansion in recent years, primarily fueled by specialized multi-robot systems developed to achieve dedicated collective actions. These specialized platforms are in general designed with swarming considerations at the front and center. Key hardware and software elements required for swarming are often deeply embedded and integrated with the particular system. However, given the noticeable increase in the number of low-cost mobile robots readily available, practitioners and hobbyists may consider assembling full-fledged swarms by minimally retrofitting such mobile platforms with a swarm-enabling technology. Here, we report one possible embodiment of such a technology designed to enable the assembly and the study of swarming in a range of general-purpose robotic systems. This is achieved by combining a modular and transferable software toolbox with a hardware suite composed of a collection of low-cost and off-the-shelf components. The developed technology can be ported to a relatively wide range of robotic platforms with minimal changes and high levels of scalability. This swarm-enabling technology has successfully been implemented on two distinct distributed multi-robot systems, a swarm of mobile marine buoys and a team of commercial terrestrial robots. We have tested the effectiveness of both of these distributed robotic systems in performing collective exploration and search scenarios, as well as other classical cooperative behaviors. Experimental results on different swarm behaviors are reported for the two platforms in uncontrolled environments and without any supporting infrastructure. The design of the associated software library allows for a seamless switch to other cooperative behaviors, and also offers the possibility to simulate newly designed collective behaviors prior to their implementation on the platforms.
Submitted 11 May, 2017;
originally announced May 2017.