Search | arXiv e-print repository

DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition

Authors: Qi Wang, Zhou Xu, Yuming Lin, Jingtao Ye, Hongsheng Li, Guangming Zhu, Syed Afaq Ali Shah, Mohammed Bennamoun, Liang Zhang

Abstract: Neuromorphic sensors, specifically event cameras, revolutionize visual data acquisition by capturing pixel intensity changes with exceptional dynamic range, minimal latency, and energy efficiency, setting them apart from conventional frame-based cameras. The distinctive capabilities of event cameras have ignited significant interest in the domain of event-based action recognition, recognizing their vast potential for advancement. However, the development in this field is currently slowed by the lack of comprehensive, large-scale datasets, which are critical for developing robust recognition frameworks. To bridge this gap, we introduce DailyDVS-200, a meticulously curated benchmark dataset tailored for the event-based action recognition community. DailyDVS-200 is extensive, covering 200 action categories across real-world scenarios, recorded by 47 participants, and comprises more than 22,000 event sequences. This dataset is designed to reflect a broad spectrum of action types, scene complexities, and data acquisition diversity. Each sequence in the dataset is annotated with 14 attributes, ensuring a detailed characterization of the recorded actions. Moreover, DailyDVS-200 is structured to facilitate a wide range of research paths, offering a solid foundation for both validating existing approaches and inspiring novel methodologies. By setting a new benchmark in the field, we challenge the current limitations of neuromorphic data processing and invite a surge of new approaches in event-based action recognition techniques, which paves the way for future explorations in neuromorphic computing and beyond. The dataset and source code are available at https://github.com/QiWang233/DailyDVS-200.

Submitted 13 July, 2024; v1 submitted 6 July, 2024; originally announced July 2024.

Comments: Accepted to ECCV 2024

arXiv:2406.14237 [pdf, other]

Finite Alphabet Fast List Decoders for Polar Codes

Authors: Syed Aizaz Ali Shah, Gerhard Bauch

Abstract: The so-called fast polar decoding schedules are meant to improve the decoding speed of the sequential-natured successive cancellation list decoders. The decoding speedup is achieved by replacing various parts of the serial decoding process with efficient special-purpose decoder nodes. This work incorporates the fast decoding schedules for polar codes into their quantized finite alphabet decoding. In a finite alphabet successive cancellation list decoder, the log-likelihood ratio computations are replaced with lookup operations on low-resolution integer messages. The lookup tables are designed using the information bottleneck method. It is shown that the finite alphabet decoders can also leverage the special decoder nodes found in the literature. Besides their inherent decoding speed improvement, the use of these special decoder nodes drastically reduces the number of lookup tables required to perform the finite alphabet decoding. In order to perform quantized decoding using lookup operations, the proposed decoders require up to 93% fewer unique lookup tables than the ones that use the conventional successive cancellation schedule. Moreover, the proposed decoders exhibit negligible loss in error correction performance without necessitating alterations to the lookup table design process.

Submitted 20 June, 2024; originally announced June 2024.

Comments: 6 pages, 7 figures, submitted to IEEE GLOBECOM 2024

arXiv:2404.01591 [pdf, other]

Language Model Guided Interpretable Video Action Reasoning

Authors: Ning Wang, Guangming Zhu, HS Li, Liang Zhang, Syed Afaq Ali Shah, Mohammed Bennamoun

Abstract: While neural networks have excelled in video action recognition tasks, their black-box nature often obscures the understanding of their decision-making processes. Recent approaches used inherently interpretable models to analyze video actions in a manner akin to human reasoning. These models, however, usually fall short in performance compared to their black-box counterparts. In this work, we present a new framework named Language-guided Interpretable Action Recognition framework (LaIAR). LaIAR leverages knowledge from language models to enhance both the recognition capabilities and the interpretability of video models. In essence, we redefine the problem of understanding video model decisions as a task of aligning video and language models. Using the logical reasoning captured by the language model, we steer the training of the video model. This integrated approach not only improves the video model's adaptability to different domains but also boosts its overall performance. Extensive experiments on two complex video action datasets, Charades & CAD-120, validate the improved performance and interpretability of our LaIAR framework. The code of LaIAR is available at https://github.com/NingWang2049/LaIAR.

Submitted 1 April, 2024; originally announced April 2024.

Comments: Accepted by CVPR 2024

arXiv:2401.03742 [pdf, other]

Flowmind2Digital: The First Comprehensive Flowmind Recognition and Conversion Approach

Authors: Huanyu Liu, Jianfeng Cai, Tingjia Zhang, Hongsheng Li, Siyuan Wang, Guangming Zhu, Syed Afaq Ali Shah, Mohammed Bennamoun, Liang Zhang

Abstract: Flowcharts and mind maps, collectively known as flowmind, are vital in daily activities, with hand-drawn versions facilitating real-time collaboration. However, there's a growing need to digitize them for efficient processing. Automated conversion methods are essential to overcome manual conversion challenges. Existing sketch recognition methods face limitations in practical situations, being field-specific and lacking digital conversion steps. Our paper introduces the Flowmind2digital method and hdFlowmind dataset to address these challenges. Flowmind2digital, utilizing neural networks and keypoint detection, achieves a record 87.3% accuracy on our dataset, surpassing previous methods by 11.9%. The hdFlowmind dataset, comprising 1,776 annotated flowminds across 22 scenarios, outperforms existing datasets. Additionally, our experiments emphasize the importance of simple graphics, enhancing accuracy by 9.3%.

Submitted 8 January, 2024; originally announced January 2024.

arXiv:2310.13263 [pdf, other]

UE4-NeRF:Neural Radiance Field for Real-Time Rendering of Large-Scale Scene

Authors: Jiaming Gu, Minchao Jiang, Hongsheng Li, Xiaoyuan Lu, Guangming Zhu, Syed Afaq Ali Shah, Liang Zhang, Mohammed Bennamoun

Abstract: Neural Radiance Fields (NeRF) is a novel implicit 3D reconstruction method that shows immense potential and has been gaining increasing attention. It enables the reconstruction of 3D scenes solely from a set of photographs. However, its real-time rendering capability, especially for interactive real-time rendering of large-scale scenes, still has significant limitations. To address these challenges, in this paper, we propose a novel neural rendering system called UE4-NeRF, specifically designed for real-time rendering of large-scale scenes. We partitioned each large scene into different sub-NeRFs. In order to represent the partitioned independent scene, we initialize polygonal meshes by constructing multiple regular octahedra within the scene and the vertices of the polygonal faces are continuously optimized during the training process. Drawing inspiration from Level of Detail (LOD) techniques, we trained meshes of varying levels of detail for different observation levels. Our approach combines with the rasterization pipeline in Unreal Engine 4 (UE4), achieving real-time rendering of large-scale scenes at 4K resolution with a frame rate of up to 43 FPS. Rendering within UE4 also facilitates scene editing in subsequent stages. Furthermore, through experiments, we have demonstrated that our method achieves rendering quality comparable to state-of-the-art approaches. Project page: https://jamchaos.github.io/UE4-NeRF/.

Submitted 20 October, 2023; originally announced October 2023.

Comments: Accepted by NeurIPS2023

arXiv:2310.00010 [pdf]

Artificial Empathy Classification: A Survey of Deep Learning Techniques, Datasets, and Evaluation Scales

Authors: Sharjeel Tahir, Syed Afaq Shah, Jumana Abu-Khalaf

Abstract: Over the last decade, researchers in the field of machine learning (ML) and assistive developmental robotics (ADR) have taken an interest in artificial empathy (AE) as a possible future paradigm for human-robot interaction (HRI). Humans learn empathy from birth; therefore, it is challenging to instill this sense in robots and intelligent machines. Nevertheless, by training over a vast amount of data and time, imitating empathy, to a certain extent, can be possible for robots. Training techniques for AE, along with findings from the field of empathetic AI research, are ever-evolving. The standard workflow for artificial empathy consists of three stages: 1) Emotion Recognition (ER) using the retrieved features from video or textual data, 2) analyzing the perceived emotion or degree of empathy to choose the best course of action, and 3) carrying out a response action. Recent studies that show AE being used with virtual agents or robots often include Deep Learning (DL) techniques. For instance, models like VGGFace are used to conduct ER. Semi-supervised models like Autoencoders generate the corresponding emotional states and behavioral responses. However, there has not been any study that presents an independent approach for evaluating AE, or the degree to which a reaction was empathetic. This paper aims to investigate and evaluate existing works for measuring and evaluating empathy, as well as the datasets that have been collected and used so far. Our goal is to highlight and facilitate the use of state-of-the-art methods in the area of AE by comparing their performance. This will aid researchers in the area of AE in selecting their approaches with precision.

Submitted 4 September, 2023; originally announced October 2023.

MSC Class: 68T40

arXiv:2309.03231 [pdf]

Quantum-AI empowered Intelligent Surveillance: Advancing Public Safety Through Innovative Contraband Detection

Authors: Syed Atif Ali Shah, Nasir Algeelani, Najeeb Al-Sammarraie

Abstract: Surveillance systems have emerged as crucial elements in upholding peace and security in the modern world. Their ubiquity aids in monitoring suspicious activities effectively. However, in densely populated environments, continuous active monitoring becomes impractical, necessitating the development of intelligent surveillance systems. AI integration in the surveillance domain was a big revolution; however, speed issues have prevented its widespread implementation in the field. It has been observed that quantum artificial intelligence has led to a great breakthrough. Quantum artificial intelligence-based surveillance systems have been shown to be more accurate as well as capable of performing well in real-time scenarios, which had never been seen before. In this research, a RetinaNet model is integrated with Quantum CNN and termed Quantum-RetinaNet. By harnessing the Quantum capabilities of QCNN, Quantum-RetinaNet strikes a balance between accuracy and speed. This innovative integration positions it as a game-changer, addressing the challenges of active monitoring in densely populated scenarios. As demand for efficient surveillance solutions continues to grow, Quantum-RetinaNet offers a compelling alternative to existing CNN models, upholding accuracy standards without sacrificing real-time performance. The unique attributes of Quantum-RetinaNet have far-reaching implications for the future of intelligent surveillance. With its enhanced processing speed, it is poised to revolutionize the field, catering to the pressing need for rapid yet precise monitoring. As Quantum-RetinaNet becomes the new standard, it ensures public safety and security while pushing the boundaries of AI in surveillance.

Submitted 5 September, 2023; originally announced September 2023.

arXiv:2308.02394 [pdf, ps, other]

Unrolled and Pipelined Decoders based on Look-Up Tables for Polar Codes

Authors: Pascal Giard, Syed Aizaz Ali Shah, Alexios Balatsoukas-Stimming, Maximilian Stark, Gerhard Bauch

Abstract: Unrolling a decoding algorithm makes it possible to achieve extremely high throughput at the cost of increased area. Look-up tables (LUTs) can be used to replace functions otherwise implemented as circuits. In this work, we show the impact of replacing blocks of logic by carefully crafted LUTs in unrolled decoders for polar codes. We show that using LUTs to improve key performance metrics (e.g., area, throughput, latency) may turn out more challenging than expected. We present three variants of LUT-based decoders and describe their inner workings as well as circuits in detail. The LUT-based decoders are compared against a regular unrolled decoder, employing fixed-point representations for numbers, with a comparable error-correction performance. A short systematic polar code is used as an illustration. All resulting unrolled decoders are shown to be capable of an information throughput of a little under 10 Gbps in a 28 nm FD-SOI technology clocked in the vicinity of 1.4 GHz to 1.5 GHz. The best variant of our LUT-based decoders is shown to reduce the area requirements by 23% compared to the regular unrolled decoder while retaining a comparable error-correction performance.

Submitted 4 August, 2023; originally announced August 2023.

Comments: Accepted to the International Symposium on Topics in Coding (ISTC) 2023

arXiv:2305.16950 [pdf, ps, other]

Implementation-Efficient Finite Alphabet Decoding of Polar Codes

Authors: Philipp Mohr, Syed Aizaz Ali Shah, Gerhard Bauch

Abstract: An implementation-efficient finite alphabet decoder for polar codes relying on coarsely quantized messages and low-complexity operations is proposed. Typically, finite alphabet decoding performs concatenated compression operations on the received channel messages to aggregate compact reliability information for error correction. These compression operations or mappings can be considered as lookup tables. For polar codes, the finite alphabet decoder design boils down to constructing lookup tables for the upper and lower branches of the building blocks within the code structure. A key challenge is to realize a hardware-friendly implementation of the lookup tables. This work uses the min-sum implementation for the upper branch lookup table and, as a novelty, a computational domain implementation for the lower branch lookup table. The computational domain approach drastically reduces the number of implementation parameters. Furthermore, a restriction to uniform quantization in the lower branch allows a very hardware-friendly compression via clipping and bit-shifting. Its behavior is close to the optimal non-uniform quantization, whose implementation would require multiple high-resolution threshold comparisons. Simulation results confirm excellent performance for the developed decoder. Unlike conventional fixed-point decoders, the proposed method involves an offline design that explicitly maximizes the preserved mutual information under coarse quantization.

Submitted 26 May, 2023; originally announced May 2023.

Comments: This work has been submitted to IEEE GLOBECOM 2023 and is currently under review
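
Illustrative note on the entry above: the abstract names a min-sum rule for the upper branch and a clip-and-shift compression for the lower branch. The Python sketch below shows the textbook form of those operations on integer messages. It is not the authors' decoder: the function names, the shift amount, and the bit width are hypothetical placeholders, and the paper's offline lookup-table design is not reproduced here.

```python
def min_sum_upper(a: int, b: int) -> int:
    """Textbook min-sum rule for the upper branch of a polar building block:
    sign(a) * sign(b) * min(|a|, |b|)."""
    sign = -1 if (a < 0) != (b < 0) else 1
    return sign * min(abs(a), abs(b))


def lower_branch_sum(a: int, b: int, u: int) -> int:
    """Textbook lower-branch update: b + (1 - 2u) * a, where u is the
    already-decided upper bit (0 or 1)."""
    return b + (1 - 2 * u) * a


def clip_and_shift(value: int, shift: int, bits: int) -> int:
    """Uniform compression: arithmetic right shift, then clipping to a
    symmetric signed range of the given width.

    `shift` and `bits` are assumed parameters for illustration only.
    Python's >> on negative integers rounds toward negative infinity.
    """
    limit = (1 << (bits - 1)) - 1  # e.g. bits=3 gives the range [-3, 3]
    shifted = value >> shift
    return max(-limit, min(limit, shifted))


if __name__ == "__main__":
    upper = min_sum_upper(3, -5)
    lower = clip_and_shift(lower_branch_sum(3, 5, 0), shift=1, bits=3)
    print(upper, lower)
```

The shift and the clip need only a barrel shifter and two comparisons against fixed limits, which is the sense in which a uniform quantizer is cheaper than a non-uniform one that compares against several high-resolution thresholds.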

arXiv:2304.09756 [pdf, other]

Contactless Human Activity Recognition using Deep Learning with Flexible and Scalable Software Define Radio

Authors: Muhammad Zakir Khan, Jawad Ahmad, Wadii Boulila, Matthew Broadbent, Syed Aziz Shah, Anis Koubaa, Qammer H. Abbasi

Abstract: Ambient computing is gaining popularity as a major technological advancement for the future. The modern era has witnessed a surge in the advancement in healthcare systems, with viable radio frequency solutions proposed for remote and unobtrusive human activity recognition (HAR). Specifically, this study investigates the use of Wi-Fi channel state information (CSI) as a novel method of ambient sensing that can be employed as a contactless means of recognizing human activity in indoor environments. These methods avoid additional costly hardware required for vision-based systems, which are privacy-intrusive, by (re)using Wi-Fi CSI for various safety and security applications. During an experiment utilizing universal software-defined radio (USRP) to collect CSI samples, it was observed that a subject engaged in six distinct activities, which included no activity, standing, sitting, and leaning forward, across different areas of the room. Additionally, more CSI samples were collected when the subject walked in two different directions. This study presents a Wi-Fi CSI-based HAR system that assesses and contrasts deep learning approaches, namely convolutional neural network (CNN), long short-term memory (LSTM), and hybrid (LSTM+CNN), employed for accurate activity recognition. The experimental results indicate that LSTM surpasses current models and achieves an average accuracy of 95.3% in multi-activity classification when compared to CNN and hybrid techniques. In the future, research needs to study the significance of resilience in diverse and dynamic environments to identify the activity of multiple users.

Submitted 18 April, 2023; originally announced April 2023.

arXiv:2206.02358 [pdf, other]

doi 10.1109/TCSII.2022.3181132

Implementation of a Modified U-Net for Medical Image Segmentation on Edge Devices

Authors: Owais Ali, Hazrat Ali, Syed Ayaz Ali Shah, Aamir Shahzad

Abstract: Deep learning techniques, particularly convolutional neural networks, have shown great potential in computer vision and medical imaging applications. However, deep learning models are computationally demanding as they require enormous computational power and specialized processing hardware for model training. To make these models portable and compatible for prototyping, their implementation on low-power devices is imperative. In this work, we present the implementation of Modified U-Net on Intel Movidius Neural Compute Stick 2 (NCS-2) for the segmentation of medical images. We selected U-Net because, in medical image segmentation, U-Net is a prominent model that provides improved performance for medical image segmentation even if the dataset size is small. The modified U-Net model is evaluated for performance in terms of dice score. Experiments are reported for segmentation task on three medical imaging datasets: BraTs dataset of brain MRI, heart MRI dataset, and Ziehl-Neelsen sputum smear microscopy image (ZNSDB) dataset. For the proposed model, we reduced the number of parameters from 30 million in the U-Net model to 0.49 million in the proposed architecture. Experimental results show that the modified U-Net provides comparable performance while requiring significantly lower resources and provides inference on the NCS-2. The maximum dice scores recorded are 0.96 for the BraTs dataset, 0.94 for the heart MRI dataset, and 0.74 for the ZNSDB dataset.

Submitted 6 June, 2022; originally announced June 2022.

Comments: Preprint of paper accepted in IEEE Transactions on Circuits and Systems II: Express Brief
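
Illustrative note on the entry above: the abstract reports results as dice scores. For readers unfamiliar with the metric, the Python sketch below computes the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), for two binary masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; the paper's own evaluation code is not shown in this listing.

```python
def dice_score(pred, target):
    """Dice coefficient of two equally shaped binary masks given as nested
    lists: 2 * |pred AND target| / (|pred| + |target|)."""
    pred_flat = [bool(p) for row in pred for p in row]
    target_flat = [bool(t) for row in target for t in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)
    if total == 0:
        # Both masks are empty; treating this as perfect agreement is an
        # assumed convention, not something stated in the abstract.
        return 1.0
    return 2 * intersection / total


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    reference = [[1, 0], [0, 0]]
    # One overlapping pixel, three foreground pixels in total: 2 * 1 / 3.
    print(dice_score(predicted, reference))
```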

arXiv:2203.14092 [pdf, other]

A large scale multi-view RGBD visual affordance learning dataset

Authors: Zeyad Khalifa, Syed Afaq Ali Shah

Abstract: The physical and textural attributes of objects have been widely studied for recognition, detection and segmentation tasks in computer vision. A number of datasets, such as large scale ImageNet, have been proposed for feature learning using data hungry deep neural networks and for hand-crafted feature extraction. To intelligently interact with objects, robots and intelligent machines need the ability to infer beyond the traditional physical/textural attributes, and understand/learn visual cues, called visual affordances, for affordance recognition, detection and segmentation. To date there is no publicly available large dataset for visual affordance understanding and learning. In this paper, we introduce a large scale multi-view RGBD visual affordance learning dataset, a benchmark of 47210 RGBD images from 37 object categories, annotated with 15 visual affordance categories. To the best of our knowledge, this is the first ever and the largest multi-view RGBD visual affordance learning dataset. We benchmark the proposed dataset for affordance segmentation and recognition tasks using popular Vision Transformer and Convolutional Neural Networks. Several state-of-the-art deep learning networks are evaluated each for affordance recognition and segmentation tasks. Our experimental results showcase the challenging nature of the dataset and present definite prospects for new and robust affordance learning algorithms. The dataset is publicly available at https://sites.google.com/view/afaqshah/dataset.

Submitted 12 September, 2023; v1 submitted 26 March, 2022; originally announced March 2022.

arXiv:2201.00443 [pdf, other]

Scene Graph Generation: A Comprehensive Survey

Authors: Guangming Zhu, Liang Zhang, Youliang Jiang, Yixuan Dang, Haoran Hou, Peiyi Shen, Mingtao Feng, Xia Zhao, Qiguang Miao, Syed Afaq Ali Shah, Mohammed Bennamoun

Abstract: Deep learning techniques have led to remarkable breakthroughs in the field of generic object detection and have spawned a lot of scene-understanding tasks in recent years. Scene graph has been the focus of research because of its powerful semantic representation and applications to scene understanding. Scene Graph Generation (SGG) refers to the task of automatically mapping an image into a semantic structural scene graph, which requires the correct labeling of detected objects and their relationships. Although this is a challenging task, the community has proposed a lot of SGG approaches and achieved good results. In this paper, we provide a comprehensive survey of recent achievements in this field brought about by deep learning techniques. We review 138 representative works that cover different input modalities, and systematically summarize existing methods of image-based SGG from the perspective of feature extraction and fusion. We attempt to connect and systematize the existing visual relationship detection methods, to summarize, and interpret the mechanisms and the strategies of SGG in a comprehensive way. Finally, we finish this survey with deep discussions about current existing problems and future research directions. This survey will help readers to develop a better understanding of the current research status and ideas.

Submitted 22 June, 2022; v1 submitted 2 January, 2022; originally announced January 2022.

Comments: Submitted to TPAMI

arXiv:2108.10217 [pdf, other]

Deep Bayesian Image Set Classification: A Defence Approach against Adversarial Attacks

Authors: Nima Mirnateghi, Syed Afaq Ali Shah, Mohammed Bennamoun

Abstract: Deep learning has become an integral part of various computer vision systems in recent years due to its outstanding achievements for object recognition, facial recognition, and scene understanding. However, deep neural networks (DNNs) are susceptible to being fooled with nearly high confidence by an adversary. In practice, the vulnerability of deep learning systems against carefully perturbed images, known as adversarial examples, poses a dire security threat in physical world applications. To address this phenomenon, we present what is, to our knowledge, the first ever image set based adversarial defence approach. Image set classification has shown an exceptional performance for object and face recognition, owing to its intrinsic property of handling appearance variability. We propose a robust deep Bayesian image set classification as a defence framework against a broad range of adversarial attacks. We extensively evaluate the performance of the proposed technique with several voting strategies. We further analyse the effects of image size, perturbation magnitude, along with the ratio of perturbed images in each image set. We also evaluate our technique with the recent state-of-the-art defence methods, and single-shot recognition task. The empirical results demonstrate superior performance on CIFAR-10, MNIST, ETH-80, and Tiny ImageNet datasets.

Submitted 23 August, 2021; originally announced August 2021.

arXiv:2106.12864 [pdf, other]

A Systematic Collection of Medical Image Datasets for Deep Learning

Authors: Johann Li, Guangming Zhu, Cong Hua, Mingtao Feng, Basheer Bennamoun, Ping Li, Xiaoyuan Lu, Juan Song, Peiyi Shen, Xu Xu, Lin Mei, Liang Zhang, Syed Afaq Ali Shah, Mohammed Bennamoun

Abstract: The astounding success made by artificial intelligence (AI) in healthcare and other fields proves that AI can achieve human-like performance. However, success always comes with challenges. Deep learning algorithms are data-dependent and require large datasets for training. The lack of data in the medical imaging field creates a bottleneck for the application of deep learning to medical image analysis. Medical image acquisition, annotation, and analysis are costly, and their usage is constrained by ethical restrictions. They also require many resources, such as human expertise and funding. That makes it difficult for non-medical researchers to have access to useful and large medical data. Thus, as comprehensive as possible, this paper provides a collection of medical image datasets with their associated challenges for deep learning research. We have collected information of around three hundred datasets and challenges mainly reported between 2013 and 2020 and categorized them into four categories: head & neck, chest & abdomen, pathology & blood, and "others". Our paper has three purposes: 1) to provide a most up to date and complete list that can be used as a universal reference to easily find the datasets for clinical image analysis, 2) to guide researchers on the methodology to test and evaluate their methods' performance and robustness on relevant datasets, 3) to provide a "route" to relevant algorithms for the relevant medical topics, and challenge leaderboards.

Submitted 24 June, 2021; originally announced June 2021.

Comments: This paper has been submitted to one journal

arXiv:2012.13137 [pdf, other]

WEmbSim: A Simple yet Effective Metric for Image Captioning

Authors: Naeha Sharif, Lyndon White, Mohammed Bennamoun, Wei Liu, Syed Afaq Ali Shah

Abstract: The area of automatic image caption evaluation is still undergoing intensive research to address the needs of generating captions which can meet adequacy and fluency requirements. Based on our past attempts at developing highly sophisticated learning-based metrics, we have discovered that a simple cosine similarity measure using the Mean of Word Embeddings (MOWE) of captions can actually achieve a surprisingly high performance on unsupervised caption evaluation. This inspires our proposed work on an effective metric WEmbSim, which beats complex measures such as SPICE, CIDEr and WMD at system-level correlation with human judgments. Moreover, it also achieves the best accuracy at matching human consensus scores for caption pairs, against commonly used unsupervised methods. Therefore, we believe that WEmbSim sets a new baseline for any complex metric to be justified.

Submitted 24 December, 2020; originally announced December 2020.

Comments: 7 pages

Journal ref: International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2020
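
The WEmbSim abstract above describes a cosine similarity over the Mean of Word Embeddings (MOWE) of two captions. The following is a minimal sketch of that idea only, not the authors' implementation: the 3-dimensional embedding table, the whitespace tokenizer, and the function names are all hypothetical and chosen for illustration. A real setup would load pretrained word vectors.

```python
import math

# Toy 3-dimensional embeddings. These values are invented for illustration;
# they are NOT taken from the paper or from any pretrained model.
EMBEDDINGS = {
    "a": [0.1, 0.0, 0.2],
    "dog": [0.9, 0.1, 0.3],
    "puppy": [0.8, 0.2, 0.3],
    "runs": [0.2, 0.9, 0.1],
    "sprints": [0.3, 0.8, 0.2],
    "car": [0.1, 0.2, 0.9],
    "parked": [0.0, 0.3, 0.8],
}


def mean_embedding(caption, table=EMBEDDINGS):
    """Average the vectors of all in-vocabulary tokens of a caption."""
    vectors = [table[tok] for tok in caption.lower().split() if tok in table]
    if not vectors:
        raise ValueError("caption has no in-vocabulary tokens")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mowe_similarity(candidate, reference):
    """Score a candidate caption against a reference caption."""
    return cosine(mean_embedding(candidate), mean_embedding(reference))


if __name__ == "__main__":
    reference = "a dog runs"
    print(mowe_similarity("a puppy sprints", reference))
    print(mowe_similarity("a car parked", reference))
```

With these toy vectors the paraphrase scores higher than the unrelated caption, which is the behaviour such a metric relies on. How the paper handles multiple references, tokenization, and out-of-vocabulary words is not stated in the abstract.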

arXiv:2012.13136 [pdf, other]

doi 10.1007/s11263-019-01206-z

LCEval: Learned Composite Metric for Caption Evaluation

Authors: Naeha Sharif, Lyndon White, Mohammed Bennamoun, Wei Liu, Syed Afaq Ali Shah

Abstract: Automatic evaluation metrics hold a fundamental importance in the development and fine-grained analysis of captioning systems. While current evaluation metrics tend to achieve an acceptable correlation with human judgements at the system level, they fail to do so at the caption level. In this work, we propose a neural network-based learned metric to improve the caption-level caption evaluation. To get a deeper insight into the parameters which impact a learned metric's performance, this paper investigates the relationship between different linguistic features and the caption-level correlation of the learned metrics. We also compare metrics trained with different training examples to measure the variations in their evaluation. Moreover, we perform a robustness analysis, which highlights the sensitivity of learned and handcrafted metrics to various sentence perturbations. Our empirical analysis shows that our proposed metric not only outperforms the existing metrics in terms of caption-level correlation but it also shows a strong system-level correlation against human assessments.

Submitted 24 December, 2020; originally announced December 2020.

Comments: 18 pages

Journal ref: International Journal of Computer Vision (October 2019)

arXiv:2012.13122 [pdf, other]

SubICap: Towards Subword-informed Image Captioning

Authors: Naeha Sharif, Mohammed Bennamoun, Wei Liu, Syed Afaq Ali Shah

Abstract: Existing Image Captioning (IC) systems model words as atomic units in captions and are unable to exploit the structural information in the words. This makes representation of rare words very difficult and out-of-vocabulary words impossible. Moreover, to avoid computational complexity, existing IC models operate over a modest sized vocabulary of frequent words, such that the identity of rare words is lost. In this work we address this common limitation of IC systems in dealing with rare words in the corpora. We decompose words into smaller constituent units 'subwords' and represent captions as a sequence of subwords instead of words. This helps represent all words in the corpora using a significantly lower subword vocabulary, leading to better parameter learning. Using subword language modeling, our captioning system improves various metric scores, with a training vocabulary size approximately 90% less than the baseline and various state-of-the-art word-level models. Our quantitative and qualitative results and analysis signify the efficacy of our proposed approach.

Submitted 24 December, 2020; originally announced December 2020.

Comments: 8 pages

Journal ref: Workshop on Applications of Computer Vision (WACV), 2021

arXiv:2008.02567 [pdf, other]

doi 10.3390/s20092653

An Intelligent Non-Invasive Real Time Human Activity Recognition System for Next-Generation Healthcare

Authors: William Taylor, Syed Aziz Shah, Kia Dashtipour, Adnan Zahid, Qammer H. Abbasi, Muhammad Ali Imran

Abstract: Human motion detection is getting considerable attention in the field of Artificial Intelligence (AI) driven healthcare systems. Human motion can be used to provide remote healthcare solutions for vulnerable people by identifying particular movements such as falls, gait and breathing disorders. This can allow people to live more independent lifestyles and still have the safety of being monitored if more direct care is needed. At present wearable devices can provide real time monitoring by deploying equipment on a person's body. However, wearing devices on the body all the time is uncomfortable, the elderly tend to forget to wear them, and there is the added insecurity of being tracked all the time. This paper demonstrates how human motions can be detected in a quasi-real-time scenario using a non-invasive method. Patterns in the wireless signals present particular human body motions as each movement induces a unique change in the wireless medium. These changes can be used to identify particular body motions. This work produces a dataset that contains patterns of radio wave signals obtained using software defined radios (SDRs) to establish if a subject is standing up or sitting down as a test case. The dataset was used to create a machine learning model, which was used in a developed application to provide a quasi-real-time classification of standing or sitting state. The machine learning model was able to achieve 96.70% accuracy with the Random Forest algorithm under 10-fold cross-validation. A benchmark dataset of wearable devices was compared to the proposed dataset and results showed the proposed dataset to have similar accuracy of nearly 90%. The machine learning models developed in this paper are tested for two activities but the developed system is designed and applicable for detecting and differentiating x number of activities.

Submitted 6 August, 2020; originally announced August 2020.

Comments: 20 pages 18 figures, journal

Journal ref: Sensors 2020, 20(9), 2653

arXiv:2008.01170 [pdf, other]

Deep Learning Models for Early Detection and Prediction of the spread of Novel Coronavirus (COVID-19)

Authors: Devante Ayris, Kye Horbury, Blake Williams, Mitchell Blackney, Celine Shi Hui See, Maleeha Imtiaz, Syed Afaq Ali Shah

Abstract: SARS-CoV2, which causes coronavirus disease (COVID-19), is continuing to spread globally and has become a pandemic. People have lost their lives due to the virus and the lack of counter measures in place. Given the increasing caseload and uncertainty of spread, there is an urgent need to develop machine learning techniques to predict the spread of COVID-19. Prediction of the spread can allow counter measures and actions to be implemented to mitigate the spread of COVID-19. In this paper, we propose a deep learning technique, called Deep Sequential Prediction Model (DSPM) and machine learning based Non-parametric Regression Model (NRM) to predict the spread of COVID-19. Our proposed models were trained and tested on novel coronavirus 2019 dataset, which contains 19.53 Million confirmed cases of COVID-19. Our proposed models were evaluated by using Mean Absolute Error and compared with baseline method. Our experimental results, both quantitative and qualitative, demonstrate the superior prediction performance of the proposed models.

Submitted 15 February, 2021; v1 submitted 29 July, 2020; originally announced August 2020.

arXiv:2007.14741 [pdf, other]

CommuNety: A Deep Learning System for the Prediction of Cohesive Social Communities

Authors: Syed Afaq Ali Shah, Weifeng Deng, Jianxin Li, Muhammad Aamir Cheema, Abdul Bais

Abstract: Effective mining of social media, which consists of a large number of users, is a challenging task. Traditional approaches rely on the analysis of text data related to users to accomplish this task. However, text data lacks significant information about the social users and their associated groups. In this paper, we propose CommuNety, a deep learning system for the prediction of cohesive social networks using images. The proposed deep learning model consists of hierarchical CNN architecture to learn descriptive features related to each cohesive network. The paper also proposes a novel Face Co-occurrence Frequency algorithm to quantify existence of people in images, and a novel photo ranking method to analyze the strength of relationship between different individuals in a predicted social network. We extensively evaluate the proposed technique on PIPA dataset and compare with state-of-the-art methods. Our experimental results demonstrate the superior performance of the proposed technique for the prediction of relationship between different individuals and the cohesiveness of communities.

Submitted 29 July, 2020; originally announced July 2020.

arXiv:2007.06013 [pdf, other]

MeDaS: An open-source platform as service to help break the walls between medicine and informatics

Authors: Liang Zhang, Johann Li, Ping Li, Xiaoyuan Lu, Peiyi Shen, Guangming Zhu, Syed Afaq Shah, Mohammed Bennarmoun, Kun Qian, Björn W. Schuller

Abstract: In the past decade, deep learning (DL) has achieved unprecedented success in numerous fields including computer vision, natural language processing, and healthcare. In particular, DL is experiencing an increasing development in applications for advanced medical image analysis in terms of analysis, segmentation, classification, and more. On the one hand, tremendous needs that leverage the power of DL for medical image analysis are arising from the research community of a medical, clinical, and informatics background to jointly share their expertise, knowledge, skills, and experience. On the other hand, barriers between disciplines stand in their way, often hampering a full and efficient collaboration. To this end, we propose our novel open-source platform, i.e., MeDaS -- the MeDical open-source platform as Service. To the best of our knowledge, MeDaS is the first open-source platform providing a collaborative and interactive service for researchers from a medical background easily using DL related toolkits, and at the same time for scientists or engineers from information sciences to understand the medical knowledge side. Based on a series of toolkits and utilities from the idea of RINV (Rapid Implementation aNd Verification), our proposed MeDaS platform can implement pre-processing, post-processing, augmentation, visualization, and other phases needed in medical image analysis. Five tasks including the subjects of lung, liver, brain, chest, and pathology, are validated and demonstrated to be efficiently realisable by using MeDaS.

Submitted 13 July, 2020; v1 submitted 12 July, 2020; originally announced July 2020.

Comments: layout error fixed

arXiv:2006.02879 [pdf, other]

Auto-decoding Graphs

Authors: Sohil Atul Shah, Vladlen Koltun

Abstract: We present an approach to synthesizing new graph structures from empirically specified distributions. The generative model is an auto-decoder that learns to synthesize graphs from latent codes. The graph synthesis model is learned jointly with an empirical distribution over the latent codes. Graphs are synthesized using self-attention modules that are trained to identify likely connectivity patterns. Graph-based normalizing flows are used to sample latent codes from the distribution learned by the auto-decoder. The resulting model combines accuracy and scalability. On benchmark datasets of large graphs, the presented model outperforms the state of the art by a factor of 1.5 in mean accuracy and average rank across at least three different graph statistics, with a 2x speedup during inference.

Submitted 4 June, 2020; originally announced June 2020.

arXiv:2002.03741 [pdf, other]

Efficient Scene Text Detection with Textual Attention Tower

Authors: Liang Zhang, Yufei Liu, Hang Xiao, Lu Yang, Guangming Zhu, Syed Afaq Shah, Mohammed Bennamoun, Peiyi Shen

Abstract: Scene text detection has received attention for years and achieved an impressive performance across various benchmarks. In this work, we propose an efficient and accurate approach to detect multioriented text in scene images. The proposed feature fusion mechanism allows us to use a shallower network to reduce the computational complexity. A self-attention mechanism is adopted to suppress false positive detections. Experiments on public benchmarks including ICDAR 2013, ICDAR 2015 and MSRA-TD500 show that our proposed approach can achieve better or comparable performances with fewer parameters and less computational cost.

Submitted 30 January, 2020; originally announced February 2020.

Comments: Accepted by ICASSP 2020

arXiv:2002.00848 [pdf, other]

doi 10.1145/3366423.3380083

Structure-Feature based Graph Self-adaptive Pooling

Authors: Liang Zhang, Xudong Wang, Hongsheng Li, Guangming Zhu, Peiyi Shen, Ping Li, Xiaoyuan Lu, Syed Afaq Ali Shah, Mohammed Bennamoun

Abstract: Various methods to deal with graph data have been proposed in recent years. However, most of these methods focus on graph feature aggregation rather than graph pooling. Besides, the existing top-k selection graph pooling methods have a few problems. First, to construct the pooled graph topology, current top-k selection methods evaluate the importance of the node from a single perspective only, which is simplistic and unobjective. Second, the feature information of unselected nodes is directly lost during the pooling process, which inevitably leads to a massive loss of graph feature information. To solve these problems mentioned above, we propose a novel graph self-adaptive pooling method with the following objectives: (1) to construct a reasonable pooled graph topology, structure and feature information of the graph are considered simultaneously, which provide additional veracity and objectivity in node selection; and (2) to make the pooled nodes contain sufficiently effective graph information, node feature information is aggregated before discarding the unimportant nodes; thus, the selected nodes contain information from neighbor nodes, which can enhance the use of features of the unselected nodes. Experimental results on four different datasets demonstrate that our method is effective in graph classification and outperforms state-of-the-art graph pooling methods.

Submitted 30 January, 2020; originally announced February 2020.

Comments: 7 pages, 4 figures, The Web Conference 2020

arXiv:1803.09470 [pdf, other]

Real Time Surveillance for Low Resolution and Limited-Data Scenarios: An Image Set Classification Approach

Authors: Uzair Nadeem, Syed Afaq Ali Shah, Mohammed Bennamoun, Roberto Togneri, Ferdous Sohel

Abstract: This paper proposes a novel image set classification technique based on the concept of linear regression. Unlike most other approaches, the proposed technique does not involve any training or feature extraction. The gallery image sets are represented as subspaces in a high dimensional space. Class specific gallery subspaces are used to estimate regression models for each image of the test image set. Images of the test set are then projected on the gallery subspaces. Residuals, calculated using the Euclidean distance between the original and the projected test images, are used as the distance metric. Three different strategies are devised to decide on the final class of the test image set. We performed extensive evaluations of the proposed technique under the challenges of low resolution, noise and less gallery data for the tasks of surveillance, video-based face recognition and object recognition. Experiments show that the proposed technique achieves a better classification accuracy and a faster execution time compared to existing techniques especially under the challenging conditions of low resolution and small gallery and test data.

Submitted 3 March, 2019; v1 submitted 26 March, 2018; originally announced March 2018.

arXiv:1803.01449 [pdf, other]

Deep Continuous Clustering

Authors: Sohil Atul Shah, Vladlen Koltun

Abstract: Clustering high-dimensional datasets is hard because interpoint distances become less informative in high-dimensional spaces. We present a clustering algorithm that performs nonlinear dimensionality reduction and clustering jointly. The data is embedded into a lower-dimensional space by a deep autoencoder. The autoencoder is optimized as part of the clustering process. The resulting network produces clustered data. The presented approach does not rely on prior knowledge of the number of ground-truth clusters. Joint nonlinear dimensionality reduction and clustering are formulated as optimization of a global continuous objective. We thus avoid discrete reconfigurations of the objective that characterize prior clustering algorithms. Experiments on datasets from multiple domains demonstrate that the presented algorithm outperforms state-of-the-art clustering schemes, including recent methods that use deep networks.

Submitted 4 March, 2018; originally announced March 2018.

Comments: The code is available at http://github.com/shahsohil/DCC

arXiv:1802.01117 [pdf, other]

Small Cell Association with Networked Flying Platforms: Novel Algorithms and Performance Bounds

Authors: Syed Awais Wahab Shah, Tamer Khattab, Muhammad Zeeshan Shakir, Mohammad Galal Khafagy, Mazen Omar Hasna

Abstract: Fifth generation (5G) and beyond-5G (B5G) systems expect coverage and capacity enhancements along with the consideration of limited power, cost and spectrum. Densification of small cells (SCs) is a promising approach to cater to these demands of 5G and B5G systems. However, such an ultra dense network of SCs requires provision of smart backhaul and fronthaul networks. In this paper, we employ a scalable idea of using networked flying platforms (NFPs) as aerial hubs to provide fronthaul connectivity to the SCs. We consider the association problem of SCs and NFPs in a SC network and study the effect of practical constraints related to the system and NFPs. Mainly, we show that the association problem is related to the generalized assignment problem (GAP). Using this relation with the GAP, we show the NP-hard complexity of the association problem and further derive an upper bound for the maximum achievable sum data rate. Linear Programming relaxation of the problem is also studied to compare the results with the derived bounds. Finally, two efficient (less complex) greedy solutions of the association problem are presented, where one of them is a distributed solution and the other one is its centralized version. Numerical results show a favorable performance of the presented algorithms with respect to the exhaustive search and derived bounds. The computational complexity comparison of the algorithms with the exhaustive search is also presented to show that the presented algorithms can be practically implemented.

Submitted 4 February, 2018; originally announced February 2018.

Comments: Submitted to IEEE JSAC Special Issue on Airborne Communication Networks 2018, 30 pages and 8 figures
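
The abstract above frames SC-to-NFP association as a generalized assignment problem and mentions greedy solutions. The sketch below is a generic centralized greedy heuristic for that kind of problem, written under stated assumptions; it is not the paper's algorithm. The function name, the rate matrix, and the two per-NFP limits (`max_links`, `capacity`) are hypothetical stand-ins for the constraints the abstract names only in general terms.

```python
def greedy_associate(rates, max_links, capacity):
    """Greedily assign small cells (SCs) to flying platforms (NFPs).

    rates[s][n]  : assumed achievable data rate if SC s is served by NFP n
    max_links[n] : assumed maximum number of SCs NFP n can serve
    capacity[n]  : assumed total data-rate budget of NFP n

    Returns (assignment, sum_rate) where assignment maps SC index -> NFP index.
    """
    num_nfps = len(max_links)
    # Consider every (SC, NFP) pair, best rate first.
    pairs = sorted(
        ((rates[s][n], s, n) for s in range(len(rates)) for n in range(num_nfps)),
        key=lambda item: -item[0],
    )
    links = [0] * num_nfps      # links currently used per NFP
    load = [0.0] * num_nfps     # data rate currently allocated per NFP
    assignment = {}
    for rate, s, n in pairs:
        if s in assignment or rate <= 0:
            continue            # SC already served, or link unusable
        if links[n] >= max_links[n]:
            continue            # NFP has no free link
        if load[n] + rate > capacity[n]:
            continue            # NFP data-rate budget would be exceeded
        assignment[s] = n
        links[n] += 1
        load[n] += rate
    return assignment, sum(load)


if __name__ == "__main__":
    # Three SCs, two NFPs; all numbers are made up for illustration.
    rates = [[10, 4], [9, 8], [3, 7]]
    print(greedy_associate(rates, max_links=[1, 2], capacity=[20, 20]))
```

A greedy pass like this runs in time dominated by sorting the SC-NFP pairs, which is why such heuristics are attractive next to exhaustive search; it carries no optimality guarantee.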

arXiv:1710.04843 [pdf]

doi 10.1016/j.future.2017.10.016

Performance Comparison of Intrusion Detection Systems and Application of Machine Learning to Snort System

Authors: Syed Ali Raza Shah, Biju Issac

Abstract: This study investigates the performance of two open source intrusion detection systems (IDSs) namely Snort and Suricata for accurately detecting the malicious traffic on computer networks. Snort and Suricata were installed on two different but identical computers and the performance was evaluated at 10 Gbps network speed. It was noted that Suricata could process a higher speed of network traffic than Snort with lower packet drop rate but it consumed higher computational resources. Snort had higher detection accuracy and was thus selected for further experiments. It was observed that the Snort triggered a high rate of false positive alarms. To solve this problem a Snort adaptive plug-in was developed. To select the best performing algorithm for Snort adaptive plug-in, an empirical study was carried out with different learning algorithms and Support Vector Machine (SVM) was selected. A hybrid version of SVM and Fuzzy logic produced a better detection accuracy. But the best result was achieved using an optimised SVM with firefly algorithm with FPR (false positive rate) as 8.6% and FNR (false negative rate) as 2.2%, which is a good result. The novelty of this work is the performance comparison of two IDSs at 10 Gbps and the application of hybrid and optimised machine learning algorithms to Snort.

Submitted 7 November, 2017; v1 submitted 13 October, 2017; originally announced October 2017.

Comments: 25 pages

Journal ref: S.A.R. Shah, B. Issac, (2018). Performance Comparison of Intrusion Detection Systems and Application of Machine Learning to Snort System, Future Generation Computer Systems, Elsevier, ISSN 0167-739X, Vol. 80, 157-170
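
The abstract above reports a false positive rate (FPR) and a false negative rate (FNR). For readers unfamiliar with the two quantities, this short sketch shows the standard definitions computed from confusion-matrix counts. The function name and the counts in the example are hypothetical and are not taken from the paper.

```python
def error_rates(tp, fp, tn, fn):
    """Return (FPR, FNR) from confusion-matrix counts.

    FPR = FP / (FP + TN): share of benign traffic flagged as malicious.
    FNR = FN / (FN + TP): share of malicious traffic that was missed.
    """
    negatives = fp + tn
    positives = fn + tp
    if negatives == 0 or positives == 0:
        raise ValueError("need at least one negative and one positive sample")
    return fp / negatives, fn / positives


if __name__ == "__main__":
    # Made-up counts for illustration only.
    fpr, fnr = error_rates(tp=98, fp=5, tn=95, fn=2)
    print(f"FPR = {fpr:.1%}, FNR = {fnr:.1%}")
```

For an intrusion detection system the two rates trade off against each other: lowering the alert threshold typically reduces missed attacks (FNR) at the cost of more false alarms (FPR).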

arXiv:1707.03510 [pdf, other]

Association of Networked Flying Platforms with Small Cells for Network Centric 5G+ C-RAN

Authors: Syed Awais Wahab Shah, Tamer Khattab, Muhammad Zeeshan Shakir, Mazen Omar Hasna

Abstract: 5G+ systems expect enhancement in data rate and coverage area under limited power constraint. Such requirements can be fulfilled by the densification of small cells (SCs). However, a major challenge is the management of fronthaul links connected with an ultra dense network of SCs. A cost effective and scalable idea of using network flying platforms (NFPs) is employed here, where the NFPs are used as fronthaul hubs that connect the SCs to the core network. The association problem of NFPs and SCs is formulated considering a number of practical constraints such as backhaul data rate limit, maximum supported links and bandwidth by NFPs and quality of service requirement of the system. The network centric case of the system is considered that aims to maximize the number of associated SCs without any biasing, i.e., no preference for high priority SCs. Then, two new efficient greedy algorithms are designed to solve the presented association problem. Numerical results show a favorable performance of our proposed methods in comparison to exhaustive search.

Submitted 11 July, 2017; originally announced July 2017.

Comments: Submitted to IEEE PIMRC 2017, 7 pages and 5 figures

arXiv:1705.03304 [pdf, other]

A Distributed Approach for Networked Flying Platform Association with Small Cells in 5G+ Networks

Authors: Syed Awais Wahab Shah, Tamer Khattab, Muhammad Zeeshan Shakir, Mazen Omar Hasna

Abstract: The densification of small-cell base stations in a 5G architecture is a promising approach to enhance the coverage area and facilitate the ever increasing capacity demand of end users. However, the bottleneck is an intelligent management of a backhaul/fronthaul network for these small-cell base stations. This involves efficient association and placement of the backhaul hubs that connect these small-cells with the core network. Terrestrial hubs suffer from non-line-of-sight link limitations and the unavailability of a proper infrastructure in an urban area. Seeing the popularity of flying platforms, we employ here an idea of using networked flying platforms (NFPs), such as unmanned aerial vehicles (UAVs), drones, and unmanned balloons flying at different altitudes, as aerial backhaul hubs. The association problem of these NFP-hubs and small-cell base stations is formulated considering backhaul link and NFP related limitations such as maximum number of supported links and bandwidth. Then, this paper presents an efficient and distributed solution of the designed problem, which performs a greedy search in order to maximize the sum rate of the overall network. A favorable performance is observed via a numerical comparison of our proposed method with optimal exhaustive search algorithm in terms of sum rate and run-time speed.

Submitted 21 April, 2017; originally announced May 2017.

Comments: Submitted to IEEE GLOBECOM 2017, 7 pages and 4 figures

arXiv:1701.02485 [pdf, other]

Efficient Image Set Classification using Linear Regression based Image Reconstruction

Authors: Syed Afaq Ali Shah, Uzair Nadeem, Mohammed Bennamoun, Ferdous Sohel, Roberto Togneri

Abstract: We propose a novel image set classification technique using linear regression models. Downsampled gallery image sets are interpreted as subspaces of a high dimensional space to avoid the computationally expensive training step. We estimate regression models for each test image using the class specific gallery subspaces. Images of the test set are then reconstructed using the regression models. Based on the minimum reconstruction error between the reconstructed and the original images, a weighted voting strategy is used to classify the test set. We performed extensive evaluation on the benchmark UCSD/Honda, CMU Mobo and YouTube Celebrity datasets for face classification, and ETH-80 dataset for object classification. The results demonstrate that by using only a small amount of training data, our technique achieved competitive classification accuracy and superior computational speed compared with the state-of-the-art methods.

Submitted 10 January, 2017; originally announced January 2017.

arXiv:1605.07329 [pdf]

doi 10.1109/JSYST.2016.2573680

Adaptive Beaconing Approaches for Vehicular ad hoc Networks: A Survey

Authors: Syed Adeel Ali Shah, Ejaz Ahmed, Feng Xia, Ahmad Karim, Muhammad Shiraz, Rafidah MD Noor

Abstract: Vehicular communication requires vehicles to self-organize through the exchange of periodic beacons. Recent analysis on beaconing indicates that the standards for beaconing restrict the desired performance of vehicular applications. This situation can be attributed to the quality of the available transmission medium, persistent change in the traffic situation and the inability of standards to cope with application requirements. To this end, this paper is motivated by the classifications and capability evaluations of existing adaptive beaconing approaches. To begin with, we explore the anatomy and the performance requirements of beaconing. Then, the beaconing design is analyzed to introduce a design-based beaconing taxonomy. A survey of the state-of-the-art is conducted with an emphasis on the salient features of the beaconing approaches. We also evaluate the capabilities of beaconing approaches using several key parameters. A comparison among beaconing approaches is presented, which is based on the architectural and implementation characteristics. The paper concludes by discussing open challenges in the field.

Submitted 24 May, 2016; originally announced May 2016.

arXiv:1506.06650 [pdf, ps, other]

Blind Source Separation Algorithms Using Hyperbolic and Givens Rotations for High-Order QAM Constellations

Authors: Syed A. W. Shah, Karim Abed-Meraim, Tareq Y. Al-Naffouri

Abstract: This paper addresses the problem of blind demixing of instantaneous mixtures in a multiple-input multiple-output communication system. The main objective is to present efficient blind source separation (BSS) algorithms dedicated to moderate or high-order QAM constellations. Four new iterative batch BSS algorithms are presented dealing with the multimodulus (MM) and alphabet matched (AM) criteria. For the optimization of these cost functions, iterative methods of Givens and hyperbolic rotations are used. A pre-whitening operation is also utilized to reduce the complexity of the design problem. It is noticed that the designed algorithms using Givens rotations give satisfactory performance only for a large number of samples. However, for a small number of samples, the algorithms designed by combining both Givens and hyperbolic rotations compensate for the ill-whitening that occurs in this case and thus improve the performance. Two algorithms dealing with the MM criterion are presented for moderate-order QAM signals such as 16-QAM. The other two dealing with the AM criterion are presented for high-order QAM signals. These methods are finally compared with the state-of-the-art batch BSS algorithms in terms of signal-to-interference and noise ratio, symbol error rate and convergence rate. Simulation results show that the proposed methods outperform the contemporary batch BSS algorithms.

Submitted 29 June, 2016; v1 submitted 22 June, 2015; originally announced June 2015.

Comments: 13 pages, 11 figures, submitted to IEEE Trans. Signal Process, updated: New algorithms added

arXiv:1307.7111 [pdf, ps, other]

doi 10.1109/BWCCA.2013.25

LPCH and UDLPCH: Location-aware Routing Techniques in WSNs

Authors: Y. Khan, N. Javaid, M. J. Khan, Y. Ahmad, M. H. Zubair, S. A. Shah

Abstract: Wireless sensor nodes along with a Base Station (BS) constitute a Wireless Sensor Network (WSN). Each node is powered by a small battery. Nodes sense data and send it to the BS. WSNs need protocols for efficient energy consumption of the network. In direct transmission and minimum transmission energy routing protocols, energy consumption is not well distributed. LEACH (Low-Energy Adaptive Clustering Hierarchy) is a clustering protocol that randomly selects the Cluster Heads (CHs) in each round. However, random selection of CHs does not guarantee efficient energy consumption of the network. Therefore, we propose new clustering techniques in routing protocols, Location-aware Permanent CH (LPCH) and User Defined Location-aware Permanent CH (UDLPCH). In both protocols, the network field is physically divided into two regions, and an equal number of nodes is randomly deployed in each region. In LPCH, the number of CHs is selected by the LEACH algorithm in the first round. In UDLPCH, however, an equal and optimum number of CHs is selected in each region, and the number of CHs remains the same throughout the network lifetime. Simulation results show that the stability period and throughput of LPCH are greater than those of LEACH, and the stability period and throughput of UDLPCH are greater than those of LPCH.

Submitted 26 July, 2013; originally announced July 2013.

Comments: IEEE 8th International Conference on Broadband and Wireless Computing, Communication and Applications (BWCCA'13), Compiegne, France

Showing 1–35 of 35 results for author: Shah, S A