Lay Text Summarisation Using Natural Language Processing: A Narrative Literature Review

Authors: Oliver Vinzelberg, Mark David Jenkins, Gordon Morison, David McMinn, Zoe Tieges

Abstract: Summarisation of research results in plain language is crucial for promoting public understanding of research findings. The use of Natural Language Processing to generate lay summaries has the potential to relieve researchers' workload and bridge the gap between science and society. The aim of this narrative literature review is to describe and compare the different text summarisation approaches used to generate lay summaries. We searched the databases Web of Science, Google Scholar, IEEE Xplore, Association for Computing Machinery Digital Library and arXiv for articles published until 6 May 2022. We included original studies on automatic text summarisation methods to generate lay summaries. We screened 82 articles and included eight relevant papers published between 2020 and 2021, all using the same dataset. The results show that transformer-based methods such as Bidirectional Encoder Representations from Transformers (BERT) and Pre-training with Extracted Gap-sentences for Abstractive Summarization (PEGASUS) dominate the landscape of lay text summarisation, with all but one study using these methods. A combination of extractive and abstractive summarisation methods in a hybrid approach was found to be most effective. Furthermore, pre-processing approaches to input text (e.g. applying extractive summarisation) or determining which sections of a text to include appear critical. Evaluation metrics such as Recall-Oriented Understudy for Gisting Evaluation (ROUGE) were used, which do not consider readability.
To conclude, automatic lay text summarisation is under-explored. Future research should consider long-document lay text summarisation, including clinical trial reports, and the development of evaluation metrics that consider readability of the lay summary.

Submitted 24 March, 2023; originally announced March 2023.

arXiv:2202.03714 [pdf, other]

What's Cracking? A Review and Analysis of Deep Learning Methods for Structural Crack Segmentation, Detection and Quantification

Authors: Jacob König, Mark Jenkins, Mike Mannion, Peter Barrie, Gordon Morison

Abstract: Surface cracks are a very common indicator of potential structural faults. Their early detection and monitoring is an important factor in structural health monitoring. Left untreated, they can grow in size over time and require expensive repairs or maintenance. With recent advances in computer vision and deep learning algorithms, the automatic detection and segmentation of cracks for this monitoring process have become a major topic of interest. This review aims to give researchers an overview of the published work within the field of crack analysis algorithms that make use of deep learning. It outlines the various tasks that are solved through applying computer vision algorithms to surface cracks in a structural health monitoring setting and also provides in-depth reviews of recent fully supervised, semi-supervised and unsupervised approaches that perform crack classification, detection, segmentation and quantification. Additionally, this review highlights popular datasets used for cracks and the metrics that are used to evaluate the performance of those algorithms. Finally, potential research gaps are outlined and further research directions are provided.

Submitted 8 February, 2022; originally announced February 2022.

arXiv:2109.00456 [pdf, other]

Weakly-Supervised Surface Crack Segmentation by Generating Pseudo-Labels using Localization with a Classifier and Thresholding

Authors: Jacob König, Mark Jenkins, Mike Mannion, Peter Barrie, Gordon Morison

Abstract: Surface cracks are a common sight on public infrastructure nowadays. Recent work has been addressing this problem by supporting structural maintenance measures using machine learning methods. Those methods are used to segment surface cracks from their background, making them easier to localize. However, a common issue is that to create a well-functioning algorithm, the training data needs to have detailed annotations of pixels that belong to cracks. Our work proposes a weakly supervised approach that leverages a CNN classifier in a novel way to create surface crack pseudo-labels. First, we use the classifier to create a rough crack localization map by using its class activation maps and a patch-based classification approach and fuse this with a thresholding-based approach to segment the mostly darker crack pixels. The classifier assists in suppressing noise from the background regions, which commonly are incorrectly highlighted as cracks by standard thresholding methods. Then, the pseudo-labels can be used in an end-to-end approach when training a standard CNN for surface crack segmentation. Our method is shown to yield sufficiently accurate pseudo-labels. Those labels, incorporated into segmentation CNN training using multiple recent crack segmentation architectures, achieve comparable performance to fully supervised methods on four popular crack segmentation datasets.

Submitted 27 October, 2021; v1 submitted 1 September, 2021; originally announced September 2021.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2008.06266 [pdf, other]

doi 10.1016/j.dsp.2020.102907

Optimized Deep Encoder-Decoder Methods for Crack Segmentation

Authors: Jacob König, Mark Jenkins, Mike Mannion, Peter Barrie, Gordon Morison

Abstract: Surface crack segmentation poses a challenging computer vision task as background, shape, colour and size of cracks vary. In this work we propose optimized deep encoder-decoder methods consisting of a combination of techniques which yield an increase in crack segmentation performance. Specifically, we propose a decoder part for an encoder-decoder based deep learning architecture for semantic segmentation and study its components to achieve increased performance. We also examine the use of different encoder strategies and introduce a data augmentation policy to increase the amount of available training data. The performance evaluation of our method is carried out on four publicly available crack segmentation datasets. Additionally, we introduce two techniques into the field of surface crack segmentation, previously not used there: generating results using test-time augmentation and performing a statistical result analysis over multiple training runs. The former approach generally yields increased performance, whereas the latter allows for better reproducibility and representability of a method's results. Using these strategies with our proposed encoder-decoder architecture, we are able to achieve new state-of-the-art results on all datasets.

Submitted 26 August, 2021; v1 submitted 14 August, 2020; originally announced August 2020.

Comments: Accepted Manuscript at Digital Signal Processing (Elsevier)

Showing 1–4 of 4 results for author: Morison, G