-
DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition
Authors:
Qi Wang,
Zhou Xu,
Yuming Lin,
Jingtao Ye,
Hongsheng Li,
Guangming Zhu,
Syed Afaq Ali Shah,
Mohammed Bennamoun,
Liang Zhang
Abstract:
Neuromorphic sensors, specifically event cameras, revolutionize visual data acquisition by capturing pixel intensity changes with exceptional dynamic range, minimal latency, and energy efficiency, setting them apart from conventional frame-based cameras. The distinctive capabilities of event cameras have ignited significant interest in the domain of event-based action recognition, recognizing their vast potential for advancement. However, the development in this field is currently slowed by the lack of comprehensive, large-scale datasets, which are critical for developing robust recognition frameworks. To bridge this gap, we introduce DailyDVS-200, a meticulously curated benchmark dataset tailored for the event-based action recognition community. DailyDVS-200 is extensive, covering 200 action categories across real-world scenarios, recorded by 47 participants, and comprises more than 22,000 event sequences. This dataset is designed to reflect a broad spectrum of action types, scene complexities, and data acquisition diversity. Each sequence in the dataset is annotated with 14 attributes, ensuring a detailed characterization of the recorded actions. Moreover, DailyDVS-200 is structured to facilitate a wide range of research paths, offering a solid foundation for both validating existing approaches and inspiring novel methodologies. By setting a new benchmark in the field, we challenge the current limitations of neuromorphic data processing and invite a surge of new approaches in event-based action recognition techniques, which paves the way for future explorations in neuromorphic computing and beyond. The dataset and source code are available at https://github.com/QiWang233/DailyDVS-200.
Submitted 13 July, 2024; v1 submitted 6 July, 2024;
originally announced July 2024.
-
Finite Alphabet Fast List Decoders for Polar Codes
Authors:
Syed Aizaz Ali Shah,
Gerhard Bauch
Abstract:
The so-called fast polar decoding schedules are meant to improve the decoding speed of the inherently sequential successive cancellation list decoders. The decoding speedup is achieved by replacing various parts of the serial decoding process with efficient special-purpose decoder nodes. This work incorporates the fast decoding schedules for polar codes into their quantized finite alphabet decoding. In a finite alphabet successive cancellation list decoder, the log-likelihood ratio computations are replaced with lookup operations on low-resolution integer messages. The lookup tables are designed using the information bottleneck method. It is shown that the finite alphabet decoders can also leverage the special decoder nodes found in the literature. Besides their inherent decoding speed improvement, the use of these special decoder nodes drastically reduces the number of lookup tables required to perform the finite alphabet decoding. To perform quantized decoding using lookup operations, the proposed decoders require up to 93% fewer unique lookup tables than those that use the conventional successive cancellation schedule. Moreover, the proposed decoders exhibit negligible loss in error correction performance without necessitating alterations to the lookup table design process.
Submitted 20 June, 2024;
originally announced June 2024.
-
Language Model Guided Interpretable Video Action Reasoning
Authors:
Ning Wang,
Guangming Zhu,
HS Li,
Liang Zhang,
Syed Afaq Ali Shah,
Mohammed Bennamoun
Abstract:
While neural networks have excelled in video action recognition tasks, their black-box nature often obscures the understanding of their decision-making processes. Recent approaches used inherently interpretable models to analyze video actions in a manner akin to human reasoning. These models, however, usually fall short in performance compared to their black-box counterparts. In this work, we present a new framework named Language-guided Interpretable Action Recognition framework (LaIAR). LaIAR leverages knowledge from language models to enhance both the recognition capabilities and the interpretability of video models. In essence, we redefine the problem of understanding video model decisions as a task of aligning video and language models. Using the logical reasoning captured by the language model, we steer the training of the video model. This integrated approach not only improves the video model's adaptability to different domains but also boosts its overall performance. Extensive experiments on two complex video action datasets, Charades and CAD-120, validate the improved performance and interpretability of our LaIAR framework. The code of LaIAR is available at https://github.com/NingWang2049/LaIAR.
Submitted 1 April, 2024;
originally announced April 2024.
-
Flowmind2Digital: The First Comprehensive Flowmind Recognition and Conversion Approach
Authors:
Huanyu Liu,
Jianfeng Cai,
Tingjia Zhang,
Hongsheng Li,
Siyuan Wang,
Guangming Zhu,
Syed Afaq Ali Shah,
Mohammed Bennamoun,
Liang Zhang
Abstract:
Flowcharts and mind maps, collectively known as flowmind, are vital in daily activities, with hand-drawn versions facilitating real-time collaboration. However, there's a growing need to digitize them for efficient processing. Automated conversion methods are essential to overcome manual conversion challenges. Existing sketch recognition methods face limitations in practical situations, being field-specific and lacking digital conversion steps. Our paper introduces the Flowmind2digital method and hdFlowmind dataset to address these challenges. Flowmind2digital, utilizing neural networks and keypoint detection, achieves a record 87.3% accuracy on our dataset, surpassing previous methods by 11.9%. The hdFlowmind dataset, comprising 1,776 annotated flowminds across 22 scenarios, outperforms existing datasets. Additionally, our experiments emphasize the importance of simple graphics, enhancing accuracy by 9.3%.
Submitted 8 January, 2024;
originally announced January 2024.
-
UE4-NeRF: Neural Radiance Field for Real-Time Rendering of Large-Scale Scene
Authors:
Jiaming Gu,
Minchao Jiang,
Hongsheng Li,
Xiaoyuan Lu,
Guangming Zhu,
Syed Afaq Ali Shah,
Liang Zhang,
Mohammed Bennamoun
Abstract:
Neural Radiance Fields (NeRF) is a novel implicit 3D reconstruction method that shows immense potential and has been gaining increasing attention. It enables the reconstruction of 3D scenes solely from a set of photographs. However, its real-time rendering capability, especially for interactive real-time rendering of large-scale scenes, still has significant limitations. To address these challenges, in this paper, we propose a novel neural rendering system called UE4-NeRF, specifically designed for real-time rendering of large-scale scenes. We partitioned each large scene into different sub-NeRFs. In order to represent the partitioned independent scene, we initialize polygonal meshes by constructing multiple regular octahedra within the scene and the vertices of the polygonal faces are continuously optimized during the training process. Drawing inspiration from Level of Detail (LOD) techniques, we trained meshes of varying levels of detail for different observation levels. Our approach combines with the rasterization pipeline in Unreal Engine 4 (UE4), achieving real-time rendering of large-scale scenes at 4K resolution with a frame rate of up to 43 FPS. Rendering within UE4 also facilitates scene editing in subsequent stages. Furthermore, through experiments, we have demonstrated that our method achieves rendering quality comparable to state-of-the-art approaches. Project page: https://jamchaos.github.io/UE4-NeRF/.
Submitted 20 October, 2023;
originally announced October 2023.
-
Artificial Empathy Classification: A Survey of Deep Learning Techniques, Datasets, and Evaluation Scales
Authors:
Sharjeel Tahir,
Syed Afaq Shah,
Jumana Abu-Khalaf
Abstract:
Over the last decade, researchers in the fields of machine learning (ML) and assistive developmental robotics (ADR) have taken an interest in artificial empathy (AE) as a possible future paradigm for human-robot interaction (HRI). Humans learn empathy from birth; it is therefore challenging to instill this sense in robots and intelligent machines. Nevertheless, by training over a vast amount of data and time, imitating empathy, to a certain extent, can be possible for robots. Training techniques for AE, along with findings from the field of empathetic AI research, are ever-evolving. The standard workflow for artificial empathy consists of three stages: 1) Emotion Recognition (ER) using the retrieved features from video or textual data, 2) analyzing the perceived emotion or degree of empathy to choose the best course of action, and 3) carrying out a response action. Recent studies that show AE being used with virtual agents or robots often include Deep Learning (DL) techniques. For instance, models like VGGFace are used to conduct ER. Semi-supervised models like Autoencoders generate the corresponding emotional states and behavioral responses. However, there has not been any study that presents an independent approach for evaluating AE, or the degree to which a reaction was empathetic. This paper aims to investigate and evaluate existing works for measuring and evaluating empathy, as well as the datasets that have been collected and used so far. Our goal is to highlight and facilitate the use of state-of-the-art methods in the area of AE by comparing their performance. This will aid researchers in the area of AE in selecting their approaches with precision.
Submitted 4 September, 2023;
originally announced October 2023.
-
Quantum-AI empowered Intelligent Surveillance: Advancing Public Safety Through Innovative Contraband Detection
Authors:
Syed Atif Ali Shah,
Nasir Algeelani,
Najeeb Al-Sammarraie
Abstract:
Surveillance systems have emerged as crucial elements in upholding peace and security in the modern world. Their ubiquity aids in monitoring suspicious activities effectively. However, in densely populated environments, continuous active monitoring becomes impractical, necessitating the development of intelligent surveillance systems. AI integration in the surveillance domain was a major advance; however, speed issues have prevented its widespread implementation in the field. It has been observed that quantum artificial intelligence has led to a great breakthrough. Quantum artificial intelligence-based surveillance systems have been shown to be more accurate as well as capable of performing well in real-time scenarios, which had never been seen before. In this research, a RetinaNet model is integrated with Quantum CNN and termed Quantum-RetinaNet. By harnessing the quantum capabilities of QCNN, Quantum-RetinaNet strikes a balance between accuracy and speed. This innovative integration positions it as a game-changer, addressing the challenges of active monitoring in densely populated scenarios. As demand for efficient surveillance solutions continues to grow, Quantum-RetinaNet offers a compelling alternative to existing CNN models, upholding accuracy standards without sacrificing real-time performance. The unique attributes of Quantum-RetinaNet have far-reaching implications for the future of intelligent surveillance. With its enhanced processing speed, it is poised to revolutionize the field, catering to the pressing need for rapid yet precise monitoring. As Quantum-RetinaNet becomes the new standard, it ensures public safety and security while pushing the boundaries of AI in surveillance.
Submitted 5 September, 2023;
originally announced September 2023.
-
Unrolled and Pipelined Decoders based on Look-Up Tables for Polar Codes
Authors:
Pascal Giard,
Syed Aizaz Ali Shah,
Alexios Balatsoukas-Stimming,
Maximilian Stark,
Gerhard Bauch
Abstract:
Unrolling a decoding algorithm makes it possible to achieve extremely high throughput at the cost of increased area. Look-up tables (LUTs) can be used to replace functions otherwise implemented as circuits. In this work, we show the impact of replacing blocks of logic with carefully crafted LUTs in unrolled decoders for polar codes. We show that using LUTs to improve key performance metrics (e.g., area, throughput, latency) may turn out more challenging than expected. We present three variants of LUT-based decoders and describe their inner workings as well as circuits in detail. The LUT-based decoders are compared against a regular unrolled decoder, employing fixed-point representations for numbers, with a comparable error-correction performance. A short systematic polar code is used as an illustration. All resulting unrolled decoders are shown to be capable of an information throughput of a little under 10 Gbps in a 28 nm FD-SOI technology clocked in the vicinity of 1.4 GHz to 1.5 GHz. The best variant of our LUT-based decoders is shown to reduce the area requirements by 23% compared to the regular unrolled decoder while retaining a comparable error-correction performance.
Submitted 4 August, 2023;
originally announced August 2023.
-
Implementation-Efficient Finite Alphabet Decoding of Polar Codes
Authors:
Philipp Mohr,
Syed Aizaz Ali Shah,
Gerhard Bauch
Abstract:
An implementation-efficient finite alphabet decoder for polar codes relying on coarsely quantized messages and low-complexity operations is proposed. Typically, finite alphabet decoding performs concatenated compression operations on the received channel messages to aggregate compact reliability information for error correction. These compression operations or mappings can be considered as lookup tables. For polar codes, the finite alphabet decoder design boils down to constructing lookup tables for the upper and lower branches of the building blocks within the code structure. A key challenge is to realize a hardware-friendly implementation of the lookup tables. This work uses the min-sum implementation for the upper branch lookup table and, as a novelty, a computational domain implementation for the lower branch lookup table. The computational domain approach drastically reduces the number of implementation parameters. Furthermore, a restriction to uniform quantization in the lower branch allows a very hardware-friendly compression via clipping and bit-shifting. Its behavior is close to the optimal non-uniform quantization, whose implementation would require multiple high-resolution threshold comparisons. Simulation results confirm excellent performance for the developed decoder. Unlike conventional fixed-point decoders, the proposed method involves an offline design that explicitly maximizes the preserved mutual information under coarse quantization.
Submitted 26 May, 2023;
originally announced May 2023.
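The two low-complexity operations named in the abstract above can be illustrated with a minimal sketch. It is not the authors' decoder: the function names, the integer message format, and the choice of a 3-bit output are assumptions made here for illustration, and the lower-branch update shown is the textbook successive cancellation rule rather than the paper's computational domain design.

```python
def min_sum_upper(a: int, b: int) -> int:
    """Min-sum approximation of the upper-branch update:
    sign(a) * sign(b) * min(|a|, |b|)."""
    sign = -1 if (a < 0) != (b < 0) else 1
    return sign * min(abs(a), abs(b))


def lower_branch(a: int, b: int, u: int) -> int:
    """Textbook lower-branch update given the partial-sum bit u:
    b + a when u == 0, b - a when u == 1."""
    return b + a if u == 0 else b - a


def compress(value: int, shift: int, bits: int) -> int:
    """Uniform quantization by bit-shifting, then clipping.

    The arithmetic right shift divides by 2**shift (rounding toward
    negative infinity); clipping limits the result to a symmetric
    range representable with `bits` bits. Both parameters are
    hypothetical design choices, not values from the paper.
    """
    max_level = (1 << (bits - 1)) - 1
    scaled = value >> shift
    return max(-max_level, min(max_level, scaled))


if __name__ == "__main__":
    upper = min_sum_upper(3, -5)
    lower = lower_branch(3, 5, 0)
    print(upper, compress(lower, shift=1, bits=3))
```

The appeal of this kind of compression is that it needs no threshold comparisons against high-resolution constants: a shift and a saturation are enough, which is what makes a uniform quantizer hardware-friendly.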
-
Contactless Human Activity Recognition using Deep Learning with Flexible and Scalable Software Define Radio
Authors:
Muhammad Zakir Khan,
Jawad Ahmad,
Wadii Boulila,
Matthew Broadbent,
Syed Aziz Shah,
Anis Koubaa,
Qammer H. Abbasi
Abstract:
Ambient computing is gaining popularity as a major technological advancement for the future. The modern era has witnessed a surge of advances in healthcare systems, with viable radio frequency solutions proposed for remote and unobtrusive human activity recognition (HAR). Specifically, this study investigates the use of Wi-Fi channel state information (CSI) as a novel method of ambient sensing that can be employed as a contactless means of recognizing human activity in indoor environments. These methods avoid additional costly hardware required for vision-based systems, which are privacy-intrusive, by (re)using Wi-Fi CSI for various safety and security applications. In an experiment that used a Universal Software Radio Peripheral (USRP) to collect CSI samples, a subject performed six distinct activities, which included no activity, standing, sitting, and leaning forward, across different areas of the room. Additionally, more CSI samples were collected when the subject walked in two different directions. This study presents a Wi-Fi CSI-based HAR system that assesses and contrasts deep learning approaches, namely convolutional neural network (CNN), long short-term memory (LSTM), and hybrid (LSTM+CNN), employed for accurate activity recognition. The experimental results indicate that LSTM surpasses current models and achieves an average accuracy of 95.3% in multi-activity classification when compared to CNN and hybrid techniques. In the future, research needs to study the significance of resilience in diverse and dynamic environments to identify the activity of multiple users.
Submitted 18 April, 2023;
originally announced April 2023.
-
Implementation of a Modified U-Net for Medical Image Segmentation on Edge Devices
Authors:
Owais Ali,
Hazrat Ali,
Syed Ayaz Ali Shah,
Aamir Shahzad
Abstract:
Deep learning techniques, particularly convolutional neural networks, have shown great potential in computer vision and medical imaging applications. However, deep learning models are computationally demanding as they require enormous computational power and specialized processing hardware for model training. To make these models portable and compatible for prototyping, their implementation on low-power devices is imperative. In this work, we present the implementation of Modified U-Net on Intel Movidius Neural Compute Stick 2 (NCS-2) for the segmentation of medical images. We selected U-Net because, in medical image segmentation, U-Net is a prominent model that provides improved performance for medical image segmentation even if the dataset size is small. The modified U-Net model is evaluated for performance in terms of dice score. Experiments are reported for segmentation task on three medical imaging datasets: BraTs dataset of brain MRI, heart MRI dataset, and Ziehl-Neelsen sputum smear microscopy image (ZNSDB) dataset. For the proposed model, we reduced the number of parameters from 30 million in the U-Net model to 0.49 million in the proposed architecture. Experimental results show that the modified U-Net provides comparable performance while requiring significantly lower resources and provides inference on the NCS-2. The maximum dice scores recorded are 0.96 for the BraTs dataset, 0.94 for the heart MRI dataset, and 0.74 for the ZNSDB dataset.
Submitted 6 June, 2022;
originally announced June 2022.
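The abstract above reports results as dice scores. For readers unfamiliar with the metric, the following is a minimal sketch of the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), on flattened binary masks. The function name and the convention of returning 1.0 for two empty masks are assumptions made here, not details taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient between two flat binary masks (sequences of 0/1).

    Returns 2 * |pred AND target| / (|pred| + |target|). Two empty masks
    are treated as a perfect match (an assumed convention).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    ground_truth = [1, 0, 0, 0]
    print(round(dice_score(predicted, ground_truth), 4))
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground pixels.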
-
A large scale multi-view RGBD visual affordance learning dataset
Authors:
Zeyad Khalifa,
Syed Afaq Ali Shah
Abstract:
The physical and textural attributes of objects have been widely studied for recognition, detection and segmentation tasks in computer vision. A number of datasets, such as large scale ImageNet, have been proposed for feature learning using data hungry deep neural networks and for hand-crafted feature extraction. To intelligently interact with objects, robots and intelligent machines need the ability to infer beyond the traditional physical/textural attributes, and understand/learn visual cues, called visual affordances, for affordance recognition, detection and segmentation. To date there is no publicly available large dataset for visual affordance understanding and learning. In this paper, we introduce a large scale multi-view RGBD visual affordance learning dataset, a benchmark of 47210 RGBD images from 37 object categories, annotated with 15 visual affordance categories. To the best of our knowledge, this is the first ever and the largest multi-view RGBD visual affordance learning dataset. We benchmark the proposed dataset for affordance segmentation and recognition tasks using popular Vision Transformer and Convolutional Neural Networks. Several state-of-the-art deep learning networks are evaluated each for affordance recognition and segmentation tasks. Our experimental results showcase the challenging nature of the dataset and present definite prospects for new and robust affordance learning algorithms. The dataset is publicly available at https://sites.google.com/view/afaqshah/dataset.
Submitted 12 September, 2023; v1 submitted 26 March, 2022;
originally announced March 2022.
-
Scene Graph Generation: A Comprehensive Survey
Authors:
Guangming Zhu,
Liang Zhang,
Youliang Jiang,
Yixuan Dang,
Haoran Hou,
Peiyi Shen,
Mingtao Feng,
Xia Zhao,
Qiguang Miao,
Syed Afaq Ali Shah,
Mohammed Bennamoun
Abstract:
Deep learning techniques have led to remarkable breakthroughs in the field of generic object detection and have spawned a lot of scene-understanding tasks in recent years. Scene graph has been the focus of research because of its powerful semantic representation and applications to scene understanding. Scene Graph Generation (SGG) refers to the task of automatically mapping an image into a semantic structural scene graph, which requires the correct labeling of detected objects and their relationships. Although this is a challenging task, the community has proposed a lot of SGG approaches and achieved good results. In this paper, we provide a comprehensive survey of recent achievements in this field brought about by deep learning techniques. We review 138 representative works that cover different input modalities, and systematically summarize existing methods of image-based SGG from the perspective of feature extraction and fusion. We attempt to connect and systematize the existing visual relationship detection methods, to summarize, and interpret the mechanisms and the strategies of SGG in a comprehensive way. Finally, we finish this survey with deep discussions about current existing problems and future research directions. This survey will help readers to develop a better understanding of the current research status and ideas.
Submitted 22 June, 2022; v1 submitted 2 January, 2022;
originally announced January 2022.
-
Deep Bayesian Image Set Classification: A Defence Approach against Adversarial Attacks
Authors:
Nima Mirnateghi,
Syed Afaq Ali Shah,
Mohammed Bennamoun
Abstract:
Deep learning has become an integral part of various computer vision systems in recent years due to its outstanding achievements for object recognition, facial recognition, and scene understanding. However, deep neural networks (DNNs) are susceptible to being fooled with high confidence by an adversary. In practice, the vulnerability of deep learning systems against carefully perturbed images, known as adversarial examples, poses a dire security threat in physical-world applications. To address this phenomenon, we present what is, to our knowledge, the first ever image set based adversarial defence approach. Image set classification has shown exceptional performance for object and face recognition, owing to its intrinsic property of handling appearance variability. We propose a robust deep Bayesian image set classification as a defence framework against a broad range of adversarial attacks. We extensively evaluate the performance of the proposed technique with several voting strategies. We further analyse the effects of image size, perturbation magnitude, along with the ratio of perturbed images in each image set. We also evaluate our technique against the recent state-of-the-art defence methods, and on the single-shot recognition task. The empirical results demonstrate superior performance on CIFAR-10, MNIST, ETH-80, and Tiny ImageNet datasets.
Submitted 23 August, 2021;
originally announced August 2021.
-
A Systematic Collection of Medical Image Datasets for Deep Learning
Authors:
Johann Li,
Guangming Zhu,
Cong Hua,
Mingtao Feng,
Basheer Bennamoun,
Ping Li,
Xiaoyuan Lu,
Juan Song,
Peiyi Shen,
Xu Xu,
Lin Mei,
Liang Zhang,
Syed Afaq Ali Shah,
Mohammed Bennamoun
Abstract:
The astounding success made by artificial intelligence (AI) in healthcare and other fields proves that AI can achieve human-like performance. However, success always comes with challenges. Deep learning algorithms are data-dependent and require large datasets for training. The lack of data in the medical imaging field creates a bottleneck for the application of deep learning to medical image analysis. Medical image acquisition, annotation, and analysis are costly, and their usage is constrained by ethical restrictions. They also require many resources, such as human expertise and funding. That makes it difficult for non-medical researchers to have access to useful and large medical data. Thus, this paper provides a collection, as comprehensive as possible, of medical image datasets with their associated challenges for deep learning research. We have collected information on around three hundred datasets and challenges mainly reported between 2013 and 2020 and categorized them into four categories: head & neck, chest & abdomen, pathology & blood, and "others". Our paper has three purposes: 1) to provide an up-to-date and complete list that can be used as a universal reference to easily find the datasets for clinical image analysis, 2) to guide researchers on the methodology to test and evaluate their methods' performance and robustness on relevant datasets, 3) to provide a "route" to relevant algorithms for the relevant medical topics, and challenge leaderboards.
Submitted 24 June, 2021;
originally announced June 2021.
-
WEmbSim: A Simple yet Effective Metric for Image Captioning
Authors:
Naeha Sharif,
Lyndon White,
Mohammed Bennamoun,
Wei Liu,
Syed Afaq Ali Shah
Abstract:
The area of automatic image caption evaluation is still undergoing intensive research to address the need for generating captions that meet adequacy and fluency requirements. Based on our past attempts at developing highly sophisticated learning-based metrics, we have discovered that a simple cosine similarity measure using the Mean of Word Embeddings (MOWE) of captions can actually achieve surprisingly high performance on unsupervised caption evaluation. This inspires our proposed work on an effective metric, WEmbSim, which beats complex measures such as SPICE, CIDEr and WMD at system-level correlation with human judgments. Moreover, it also achieves the best accuracy at matching human consensus scores for caption pairs, compared with commonly used unsupervised methods. Therefore, we believe that WEmbSim sets a new baseline for any complex metric to be justified.
Submitted 24 December, 2020;
originally announced December 2020.
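The core idea in the WEmbSim abstract above, cosine similarity between the mean word embeddings of two captions, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the three-dimensional vectors in `EMBEDDINGS`, the whitespace tokenization, and the function names are all assumptions made for the example; a real setup would load pretrained word embeddings.

```python
import math

# Toy 3-dimensional word vectors (hypothetical values, for illustration only).
EMBEDDINGS = {
    "a": [0.1, 0.0, 0.2],
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "runs": [0.0, 0.9, 0.1],
    "sleeps": [0.0, 0.1, 0.9],
}


def mean_embedding(caption, embeddings):
    """Average the vectors of the in-vocabulary words of a caption."""
    vectors = [embeddings[w] for w in caption.lower().split() if w in embeddings]
    if not vectors:
        raise ValueError("caption has no in-vocabulary words")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mowe_similarity(candidate, reference, embeddings=EMBEDDINGS):
    """Score a candidate caption against a reference via mean word embeddings."""
    return cosine_similarity(
        mean_embedding(candidate, embeddings),
        mean_embedding(reference, embeddings),
    )
```

With these toy vectors, `mowe_similarity("a dog runs", "a puppy runs")` scores higher than `mowe_similarity("a dog runs", "a dog sleeps")`, because "puppy" lies close to "dog" while "sleeps" points away from "runs".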
-
LCEval: Learned Composite Metric for Caption Evaluation
Authors:
Naeha Sharif,
Lyndon White,
Mohammed Bennamoun,
Wei Liu,
Syed Afaq Ali Shah
Abstract:
Automatic evaluation metrics are of fundamental importance in the development and fine-grained analysis of captioning systems. While current evaluation metrics tend to achieve an acceptable correlation with human judgements at the system level, they fail to do so at the caption level. In this work, we propose a neural network-based learned metric to improve caption-level caption evaluation. To gain deeper insight into the parameters which impact a learned metric's performance, this paper investigates the relationship between different linguistic features and the caption-level correlation of the learned metrics. We also compare metrics trained with different training examples to measure the variations in their evaluation. Moreover, we perform a robustness analysis, which highlights the sensitivity of learned and handcrafted metrics to various sentence perturbations. Our empirical analysis shows that our proposed metric not only outperforms the existing metrics in terms of caption-level correlation but also shows a strong system-level correlation against human assessments.
Submitted 24 December, 2020;
originally announced December 2020.
-
SubICap: Towards Subword-informed Image Captioning
Authors:
Naeha Sharif,
Mohammed Bennamoun,
Wei Liu,
Syed Afaq Ali Shah
Abstract:
Existing Image Captioning (IC) systems model words as atomic units in captions and are unable to exploit the structural information in the words. This makes representation of rare words very difficult and out-of-vocabulary words impossible. Moreover, to avoid computational complexity, existing IC models operate over a modest sized vocabulary of frequent words, such that the identity of rare words is lost. In this work we address this common limitation of IC systems in dealing with rare words in the corpora. We decompose words into smaller constituent units 'subwords' and represent captions as a sequence of subwords instead of words. This helps represent all words in the corpora using a significantly lower subword vocabulary, leading to better parameter learning. Using subword language modeling, our captioning system improves various metric scores, with a training vocabulary size approximately 90% less than the baseline and various state-of-the-art word-level models. Our quantitative and qualitative results and analysis signify the efficacy of our proposed approach.
Submitted 24 December, 2020;
originally announced December 2020.
-
An Intelligent Non-Invasive Real Time Human Activity Recognition System for Next-Generation Healthcare
Authors:
William Taylor,
Syed Aziz Shah,
Kia Dashtipour,
Adnan Zahid,
Qammer H. Abbasi,
Muhammad Ali Imran
Abstract:
Human motion detection is receiving considerable attention in the field of Artificial Intelligence (AI) driven healthcare systems. Human motion can be used to provide remote healthcare solutions for vulnerable people by identifying particular movements such as falls, gait and breathing disorders. This can allow people to live more independent lifestyles and still have the safety of being monitored if more direct care is needed. At present, wearable devices can provide real-time monitoring by deploying equipment on a person's body. However, wearing devices all the time is uncomfortable, the elderly tend to forget to wear them, and constant tracking creates a sense of insecurity. This paper demonstrates how human motions can be detected in a quasi-real-time scenario using a non-invasive method. Patterns in the wireless signals represent particular human body motions, as each movement induces a unique change in the wireless medium. These changes can be used to identify particular body motions. This work produces a dataset that contains patterns of radio wave signals obtained using software defined radios (SDRs) to establish if a subject is standing up or sitting down as a test case. The dataset was used to create a machine learning model, which was used in a developed application to provide a quasi-real-time classification of standing or sitting state. The machine learning model achieved 96.70% accuracy using the Random Forest algorithm with 10-fold cross-validation. A benchmark dataset of wearable devices was compared to the proposed dataset, and the results showed the proposed dataset to have a similar accuracy of nearly 90%. The machine learning models developed in this paper are tested for two activities, but the developed system is designed for and applicable to detecting and differentiating an arbitrary number of activities.
Submitted 6 August, 2020;
originally announced August 2020.
-
Deep Learning Models for Early Detection and Prediction of the spread of Novel Coronavirus (COVID-19)
Authors:
Devante Ayris,
Kye Horbury,
Blake Williams,
Mitchell Blackney,
Celine Shi Hui See,
Maleeha Imtiaz,
Syed Afaq Ali Shah
Abstract:
SARS-CoV2, which causes coronavirus disease (COVID-19), is continuing to spread globally and has become a pandemic. People have lost their lives due to the virus and the lack of countermeasures in place. Given the increasing caseload and uncertainty of spread, there is an urgent need to develop machine learning techniques to predict the spread of COVID-19. Prediction of the spread can allow countermeasures and actions to be implemented to mitigate the spread of COVID-19. In this paper, we propose a deep learning technique, called Deep Sequential Prediction Model (DSPM), and a machine learning based Non-parametric Regression Model (NRM) to predict the spread of COVID-19. Our proposed models were trained and tested on the novel coronavirus 2019 dataset, which contains 19.53 million confirmed cases of COVID-19. Our proposed models were evaluated using Mean Absolute Error and compared with a baseline method. Our experimental results, both quantitative and qualitative, demonstrate the superior prediction performance of the proposed models.
Submitted 15 February, 2021; v1 submitted 29 July, 2020;
originally announced August 2020.
-
CommuNety: A Deep Learning System for the Prediction of Cohesive Social Communities
Authors:
Syed Afaq Ali Shah,
Weifeng Deng,
Jianxin Li,
Muhammad Aamir Cheema,
Abdul Bais
Abstract:
Effective mining of social media, which consists of a large number of users, is a challenging task. Traditional approaches rely on the analysis of text data related to users to accomplish this task. However, text data lacks significant information about the social users and their associated groups. In this paper, we propose CommuNety, a deep learning system for the prediction of cohesive social networks using images. The proposed deep learning model consists of a hierarchical CNN architecture to learn descriptive features related to each cohesive network. The paper also proposes a novel Face Co-occurrence Frequency algorithm to quantify the existence of people in images, and a novel photo ranking method to analyze the strength of the relationship between different individuals in a predicted social network. We extensively evaluate the proposed technique on the PIPA dataset and compare it with state-of-the-art methods. Our experimental results demonstrate the superior performance of the proposed technique for the prediction of the relationship between different individuals and the cohesiveness of communities.
Submitted 29 July, 2020;
originally announced July 2020.
-
MeDaS: An open-source platform as service to help break the walls between medicine and informatics
Authors:
Liang Zhang,
Johann Li,
Ping Li,
Xiaoyuan Lu,
Peiyi Shen,
Guangming Zhu,
Syed Afaq Shah,
Mohammed Bennarmoun,
Kun Qian,
Björn W. Schuller
Abstract:
In the past decade, deep learning (DL) has achieved unprecedented success in numerous fields including computer vision, natural language processing, and healthcare. In particular, DL is experiencing increasing development in applications for advanced medical image analysis, including segmentation, classification, and more. On the one hand, there is a tremendous need for researchers with medical, clinical, and informatics backgrounds to share their expertise, knowledge, skills, and experience in order to leverage the power of DL for medical image analysis. On the other hand, barriers between disciplines stand in their way, often hampering full and efficient collaboration. To this end, we propose our novel open-source platform, i.e., MeDaS -- the MeDical open-source platform as Service. To the best of our knowledge, MeDaS is the first open-source platform providing a collaborative and interactive service that lets researchers from a medical background easily use DL-related toolkits and, at the same time, helps scientists and engineers from information sciences understand the medical side. Based on a series of toolkits and utilities built around the idea of RINV (Rapid Implementation aNd Verification), our proposed MeDaS platform can implement pre-processing, post-processing, augmentation, visualization, and other phases needed in medical image analysis. Five tasks, covering the lung, liver, brain, chest, and pathology, are validated and demonstrated to be efficiently realisable using MeDaS.
Submitted 13 July, 2020; v1 submitted 12 July, 2020;
originally announced July 2020.
-
Auto-decoding Graphs
Authors:
Sohil Atul Shah,
Vladlen Koltun
Abstract:
We present an approach to synthesizing new graph structures from empirically specified distributions. The generative model is an auto-decoder that learns to synthesize graphs from latent codes. The graph synthesis model is learned jointly with an empirical distribution over the latent codes. Graphs are synthesized using self-attention modules that are trained to identify likely connectivity patterns. Graph-based normalizing flows are used to sample latent codes from the distribution learned by the auto-decoder. The resulting model combines accuracy and scalability. On benchmark datasets of large graphs, the presented model outperforms the state of the art by a factor of 1.5 in mean accuracy and average rank across at least three different graph statistics, with a 2x speedup during inference.
Submitted 4 June, 2020;
originally announced June 2020.
-
Efficient Scene Text Detection with Textual Attention Tower
Authors:
Liang Zhang,
Yufei Liu,
Hang Xiao,
Lu Yang,
Guangming Zhu,
Syed Afaq Shah,
Mohammed Bennamoun,
Peiyi Shen
Abstract:
Scene text detection has received attention for years and achieved an impressive performance across various benchmarks. In this work, we propose an efficient and accurate approach to detect multioriented text in scene images. The proposed feature fusion mechanism allows us to use a shallower network to reduce the computational complexity. A self-attention mechanism is adopted to suppress false positive detections. Experiments on public benchmarks including ICDAR 2013, ICDAR 2015 and MSRA-TD500 show that our proposed approach can achieve better or comparable performances with fewer parameters and less computational cost.
Submitted 30 January, 2020;
originally announced February 2020.
-
Structure-Feature based Graph Self-adaptive Pooling
Authors:
Liang Zhang,
Xudong Wang,
Hongsheng Li,
Guangming Zhu,
Peiyi Shen,
Ping Li,
Xiaoyuan Lu,
Syed Afaq Ali Shah,
Mohammed Bennamoun
Abstract:
Various methods to deal with graph data have been proposed in recent years. However, most of these methods focus on graph feature aggregation rather than graph pooling. Besides, the existing top-k selection graph pooling methods have a few problems. First, to construct the pooled graph topology, current top-k selection methods evaluate the importance of the node from a single perspective only, which is simplistic and unobjective. Second, the feature information of unselected nodes is directly lost during the pooling process, which inevitably leads to a massive loss of graph feature information. To solve these problems mentioned above, we propose a novel graph self-adaptive pooling method with the following objectives: (1) to construct a reasonable pooled graph topology, structure and feature information of the graph are considered simultaneously, which provide additional veracity and objectivity in node selection; and (2) to make the pooled nodes contain sufficiently effective graph information, node feature information is aggregated before discarding the unimportant nodes; thus, the selected nodes contain information from neighbor nodes, which can enhance the use of features of the unselected nodes. Experimental results on four different datasets demonstrate that our method is effective in graph classification and outperforms state-of-the-art graph pooling methods.
Submitted 30 January, 2020;
originally announced February 2020.
-
Real Time Surveillance for Low Resolution and Limited-Data Scenarios: An Image Set Classification Approach
Authors:
Uzair Nadeem,
Syed Afaq Ali Shah,
Mohammed Bennamoun,
Roberto Togneri,
Ferdous Sohel
Abstract:
This paper proposes a novel image set classification technique based on the concept of linear regression. Unlike most other approaches, the proposed technique does not involve any training or feature extraction. The gallery image sets are represented as subspaces in a high dimensional space. Class specific gallery subspaces are used to estimate regression models for each image of the test image set. Images of the test set are then projected on the gallery subspaces. Residuals, calculated using the Euclidean distance between the original and the projected test images, are used as the distance metric. Three different strategies are devised to decide on the final class of the test image set. We performed extensive evaluations of the proposed technique under the challenges of low resolution, noise and less gallery data for the tasks of surveillance, video-based face recognition and object recognition. Experiments show that the proposed technique achieves a better classification accuracy and a faster execution time compared to existing techniques especially under the challenging conditions of low resolution and small gallery and test data.
Submitted 3 March, 2019; v1 submitted 26 March, 2018;
originally announced March 2018.
-
Deep Continuous Clustering
Authors:
Sohil Atul Shah,
Vladlen Koltun
Abstract:
Clustering high-dimensional datasets is hard because interpoint distances become less informative in high-dimensional spaces. We present a clustering algorithm that performs nonlinear dimensionality reduction and clustering jointly. The data is embedded into a lower-dimensional space by a deep autoencoder. The autoencoder is optimized as part of the clustering process. The resulting network produces clustered data. The presented approach does not rely on prior knowledge of the number of ground-truth clusters. Joint nonlinear dimensionality reduction and clustering are formulated as optimization of a global continuous objective. We thus avoid discrete reconfigurations of the objective that characterize prior clustering algorithms. Experiments on datasets from multiple domains demonstrate that the presented algorithm outperforms state-of-the-art clustering schemes, including recent methods that use deep networks.
Submitted 4 March, 2018;
originally announced March 2018.
-
Small Cell Association with Networked Flying Platforms: Novel Algorithms and Performance Bounds
Authors:
Syed Awais Wahab Shah,
Tamer Khattab,
Muhammad Zeeshan Shakir,
Mohammad Galal Khafagy,
Mazen Omar Hasna
Abstract:
Fifth generation (5G) and beyond-5G (B5G) systems expect coverage and capacity enhancements along with the consideration of limited power, cost and spectrum. Densification of small cells (SCs) is a promising approach to cater to these demands of 5G and B5G systems. However, such an ultra-dense network of SCs requires provision of smart backhaul and fronthaul networks. In this paper, we employ a scalable idea of using networked flying platforms (NFPs) as aerial hubs to provide fronthaul connectivity to the SCs. We consider the association problem of SCs and NFPs in an SC network and study the effect of practical constraints related to the system and NFPs. Mainly, we show that the association problem is related to the generalized assignment problem (GAP). Using this relation with the GAP, we show the NP-hard complexity of the association problem and further derive an upper bound for the maximum achievable sum data rate. Linear Programming relaxation of the problem is also studied to compare the results with the derived bounds. Finally, two efficient (less complex) greedy solutions of the association problem are presented, where one of them is a distributed solution and the other one is its centralized version. Numerical results show a favorable performance of the presented algorithms with respect to the exhaustive search and derived bounds. The computational complexity comparison of the algorithms with the exhaustive search is also presented to show that the presented algorithms can be practically implemented.
Submitted 4 February, 2018;
originally announced February 2018.
-
Performance Comparison of Intrusion Detection Systems and Application of Machine Learning to Snort System
Authors:
Syed Ali Raza Shah,
Biju Issac
Abstract:
This study investigates the performance of two open source intrusion detection systems (IDSs), namely Snort and Suricata, in accurately detecting malicious traffic on computer networks. Snort and Suricata were installed on two separate but identical computers and the performance was evaluated at 10 Gbps network speed. It was noted that Suricata could process a higher speed of network traffic than Snort with a lower packet drop rate, but it consumed more computational resources. Snort had higher detection accuracy and was thus selected for further experiments. It was observed that Snort triggered a high rate of false positive alarms. To solve this problem, a Snort adaptive plug-in was developed. To select the best performing algorithm for the Snort adaptive plug-in, an empirical study was carried out with different learning algorithms, and Support Vector Machine (SVM) was selected. A hybrid version of SVM and fuzzy logic produced a better detection accuracy. However, the best result was achieved using an SVM optimised with the firefly algorithm, with a false positive rate (FPR) of 8.6% and a false negative rate (FNR) of 2.2%. The novelty of this work is the performance comparison of two IDSs at 10 Gbps and the application of hybrid and optimised machine learning algorithms to Snort.
Submitted 7 November, 2017; v1 submitted 13 October, 2017;
originally announced October 2017.
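The two figures reported in the abstract above, false positive rate (FPR) and false negative rate (FNR), follow directly from confusion-matrix counts: FPR = FP / (FP + TN) and FNR = FN / (FN + TP). The sketch below computes both from labelled alerts. The label convention (1 = malicious, 0 = benign) and the function name are assumptions made for illustration, not details taken from the paper.

```python
def error_rates(true_labels, predicted_labels):
    """Return (FPR, FNR) for binary labels, where 1 = malicious and 0 = benign."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(true_labels, predicted_labels))
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign flagged as attack
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign passed
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attack missed
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attack detected

    # Guard against empty classes so the ratios are always defined.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr
```

A detector that flags one of four benign flows and misses one of six attacks would give an FPR of 0.25 and an FNR of about 0.167.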
-
Association of Networked Flying Platforms with Small Cells for Network Centric 5G+ C-RAN
Authors:
Syed Awais Wahab Shah,
Tamer Khattab,
Muhammad Zeeshan Shakir,
Mazen Omar Hasna
Abstract:
5G+ systems expect enhancement in data rate and coverage area under a limited power constraint. Such requirements can be fulfilled by the densification of small cells (SCs). However, a major challenge is the management of fronthaul links connected with an ultra-dense network of SCs. A cost-effective and scalable idea of using networked flying platforms (NFPs) is employed here, where the NFPs are used as fronthaul hubs that connect the SCs to the core network. The association problem of NFPs and SCs is formulated considering a number of practical constraints such as the backhaul data rate limit, the maximum number of links and bandwidth supported by NFPs, and the quality of service requirement of the system. The network centric case of the system is considered, which aims to maximize the number of associated SCs without any biasing, i.e., no preference for high priority SCs. Then, two new efficient greedy algorithms are designed to solve the presented association problem. Numerical results show a favorable performance of our proposed methods in comparison to exhaustive search.
Submitted 11 July, 2017;
originally announced July 2017.
-
A Distributed Approach for Networked Flying Platform Association with Small Cells in 5G+ Networks
Authors:
Syed Awais Wahab Shah,
Tamer Khattab,
Muhammad Zeeshan Shakir,
Mazen Omar Hasna
Abstract:
The densification of small-cell base stations in a 5G architecture is a promising approach to enhance the coverage area and meet the ever-increasing capacity demand of end users. However, the bottleneck is the intelligent management of a backhaul/fronthaul network for these small-cell base stations. This involves efficient association and placement of the backhaul hubs that connect these small cells with the core network. Terrestrial hubs suffer from inefficient non-line-of-sight link limitations and the unavailability of proper infrastructure in urban areas. Given the popularity of flying platforms, we employ here the idea of using networked flying platforms (NFPs), such as unmanned aerial vehicles (UAVs), drones, and unmanned balloons flying at different altitudes, as aerial backhaul hubs. The association problem of these NFP-hubs and small-cell base stations is formulated considering backhaul link and NFP related limitations such as the maximum number of supported links and bandwidth. Then, this paper presents an efficient and distributed solution of the designed problem, which performs a greedy search in order to maximize the sum rate of the overall network. A favorable performance is observed via a numerical comparison of our proposed method with the optimal exhaustive search algorithm in terms of sum rate and run-time speed.
Submitted 21 April, 2017;
originally announced May 2017.
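The abstract above describes a greedy search that associates small cells with NFP hubs to maximize the sum rate, subject to per-NFP limits on links and bandwidth. The sketch below shows one generic greedy heuristic of that kind. It is not the authors' algorithm (the abstract does not spell it out, and this version is centralized rather than distributed): the data layout, the parameter names, and the "best rate first" ordering are all assumptions made for illustration.

```python
def greedy_association(rates, max_links, bandwidth, demand):
    """Greedily assign small cells (SCs) to NFP hubs, highest rate first.

    rates[s][n]  : achievable rate of SC s when served by NFP n (hypothetical input)
    max_links[n] : maximum number of SCs that NFP n can serve
    bandwidth[n] : total bandwidth available at NFP n
    demand[s]    : bandwidth required by SC s
    Returns a dict mapping each associated SC index to its NFP index.
    """
    # Enumerate all (rate, SC, NFP) candidates and sort by descending rate.
    candidates = sorted(
        ((rates[s][n], s, n)
         for s in range(len(rates))
         for n in range(len(max_links))),
        key=lambda item: -item[0],
    )

    links_left = list(max_links)
    bandwidth_left = list(bandwidth)
    assignment = {}

    for _rate, s, n in candidates:
        if s in assignment:
            continue  # this SC already has a hub
        if links_left[n] < 1 or bandwidth_left[n] < demand[s]:
            continue  # this NFP cannot take the SC
        assignment[s] = n
        links_left[n] -= 1
        bandwidth_left[n] -= demand[s]

    return assignment


def sum_rate(rates, assignment):
    """Total rate achieved by an association."""
    return sum(rates[s][n] for s, n in assignment.items())
```

For example, with `rates = [[10, 4], [9, 3], [8, 7]]`, `max_links = [1, 2]`, ample bandwidth, and equal demands, the first NFP takes only the best small cell and the other two fall back to the second NFP. A greedy pass like this gives no optimality guarantee, which is why the abstract compares against exhaustive search.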
-
Efficient Image Set Classification using Linear Regression based Image Reconstruction
Authors:
Syed Afaq Ali Shah,
Uzair Nadeem,
Mohammed Bennamoun,
Ferdous Sohel,
Roberto Togneri
Abstract:
We propose a novel image set classification technique using linear regression models. Downsampled gallery image sets are interpreted as subspaces of a high dimensional space to avoid the computationally expensive training step. We estimate regression models for each test image using the class specific gallery subspaces. Images of the test set are then reconstructed using the regression models. Based on the minimum reconstruction error between the reconstructed and the original images, a weighted voting strategy is used to classify the test set. We performed extensive evaluation on the benchmark UCSD/Honda, CMU Mobo and YouTube Celebrity datasets for face classification, and ETH-80 dataset for object classification. The results demonstrate that by using only a small amount of training data, our technique achieved competitive classification accuracy and superior computational speed compared with the state-of-the-art methods.
Submitted 10 January, 2017;
originally announced January 2017.
-
Adaptive Beaconing Approaches for Vehicular ad hoc Networks: A Survey
Authors:
Syed Adeel Ali Shah,
Ejaz Ahmed,
Feng Xia,
Ahmad Karim,
Muhammad Shiraz,
Rafidah MD Noor
Abstract:
Vehicular communication requires vehicles to self-organize through the exchange of periodic beacons. Recent analysis on beaconing indicates that the standards for beaconing restrict the desired performance of vehicular applications. This situation can be attributed to the quality of the available transmission medium, persistent change in the traffic situation and the inability of standards to cope with application requirements. To this end, this paper is motivated by the classifications and capability evaluations of existing adaptive beaconing approaches. To begin with, we explore the anatomy and the performance requirements of beaconing. Then, the beaconing design is analyzed to introduce a design-based beaconing taxonomy. A survey of the state-of-the-art is conducted with an emphasis on the salient features of the beaconing approaches. We also evaluate the capabilities of beaconing approaches using several key parameters. A comparison among beaconing approaches is presented, which is based on the architectural and implementation characteristics. The paper concludes by discussing open challenges in the field.
Submitted 24 May, 2016;
originally announced May 2016.
-
Blind Source Separation Algorithms Using Hyperbolic and Givens Rotations for High-Order QAM Constellations
Authors:
Syed A. W. Shah,
Karim Abed-Meraim,
Tareq Y. Al-Naffouri
Abstract:
This paper addresses the problem of blind demixing of instantaneous mixtures in a multiple-input multiple-output communication system. The main objective is to present efficient blind source separation (BSS) algorithms dedicated to moderate or high-order QAM constellations. Four new iterative batch BSS algorithms are presented, dealing with the multimodulus (MM) and alphabet matched (AM) criteria. To optimize these cost functions, iterative methods based on Givens and hyperbolic rotations are used. A pre-whitening operation is also utilized to reduce the complexity of the design problem. The algorithms designed using Givens rotations alone give satisfactory performance only for a large number of samples. For a small number of samples, however, the algorithms that combine Givens and hyperbolic rotations compensate for the ill-whitening that occurs in this case and thus improve the performance. Two algorithms dealing with the MM criterion are presented for moderate-order QAM signals such as 16-QAM. The other two, dealing with the AM criterion, are presented for high-order QAM signals. These methods are finally compared with state-of-the-art batch BSS algorithms in terms of signal-to-interference-and-noise ratio, symbol error rate, and convergence rate. Simulation results show that the proposed methods outperform contemporary batch BSS algorithms.
Submitted 29 June, 2016; v1 submitted 22 June, 2015;
originally announced June 2015.
-
LPCH and UDLPCH: Location-aware Routing Techniques in WSNs
Authors:
Y. Khan,
N. Javaid,
M. J. Khan,
Y. Ahmad,
M. H. Zubair,
S. A. Shah
Abstract:
Wireless sensor nodes, together with a Base Station (BS), constitute a Wireless Sensor Network (WSN). Nodes are powered by small batteries; they sense data and send it to the BS. WSNs therefore need protocols that use the network's energy efficiently. In direct transmission and minimum transmission energy routing protocols, energy consumption is not well distributed. LEACH (Low-Energy Adaptive Clustering Hierarchy) is a clustering protocol that randomly selects the Cluster Heads (CHs) in each round. However, random selection of CHs does not guarantee efficient energy consumption of the network. We therefore propose two new clustering techniques for routing protocols: Location-aware Permanent CH (LPCH) and User Defined Location-aware Permanent CH (UDLPCH). In both protocols, the network field is physically divided into two regions, and an equal number of nodes is randomly deployed in each region. In LPCH, a number of CHs are selected by the LEACH algorithm in the first round. In UDLPCH, an equal and optimum number of CHs is selected in each region, and the number of CHs remains the same throughout the network lifetime. Simulation results show that the stability period and throughput of LPCH are greater than those of LEACH, and that the stability period and throughput of UDLPCH are greater than those of LPCH.
Submitted 26 July, 2013;
originally announced July 2013.
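Note: the random per-round CH selection that the abstract attributes to LEACH can be sketched as follows. This is a minimal illustration, not the authors' code; it assumes the commonly cited LEACH election threshold T = p / (1 - p * (r mod 1/p)), which the abstract itself does not state. The function names and the `last_ch_round` bookkeeping dictionary are hypothetical.

```python
import random


def leach_threshold(p, r):
    """Election threshold for an eligible node in round r (0-based).

    p is the desired fraction of cluster heads per round (assumed
    parameter). The threshold grows over each cycle of 1/p rounds.
    """
    period = round(1 / p)
    return p / (1 - p * (r % period))


def elect_cluster_heads(node_ids, p, r, last_ch_round, rng):
    """Randomly elect cluster heads for round r.

    A node is eligible only if it has not served as a cluster head
    within the last 1/p rounds. last_ch_round maps node id -> round in
    which it last served, and is updated in place.
    """
    period = round(1 / p)
    threshold = leach_threshold(p, r)
    heads = []
    for node in node_ids:
        last = last_ch_round.get(node)
        eligible = last is None or r - last >= period
        # Each eligible node draws a uniform number in [0, 1) and
        # becomes a cluster head if the draw falls below the threshold.
        if eligible and rng.random() < threshold:
            heads.append(node)
            last_ch_round[node] = r
    return heads


if __name__ == "__main__":
    rng = random.Random(0)
    history = {}
    for rnd in range(8):
        print(rnd, elect_cluster_heads(range(20), 0.25, rnd, history, rng))
```

Because every node draws independently, the number and placement of CHs vary from round to round, which is the behaviour the abstract identifies as not guaranteeing efficient energy consumption.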