Search | arXiv e-print repository

BD-SAT: High-resolution Land Use Land Cover Dataset & Benchmark Results for Developing Division: Dhaka, BD

Authors: Ovi Paul, Abu Bakar Siddik Nayem, Anis Sarker, Amin Ahsan Ali, M Ashraful Amin, AKM Mahbubur Rahman

Abstract: Land Use Land Cover (LULC) analysis on satellite images using deep learning-based methods is significantly helpful in understanding the geography, socio-economic conditions, poverty levels, and urban sprawl in developing countries. Recent works involve segmentation with LULC classes such as farmland, built-up areas, forests, meadows, water bodies, etc. Training deep learning methods on satellite images requires large sets of images annotated with LULC classes. However, annotated data for developing countries are scarce due to a lack of funding, absence of dedicated residential/industrial/economic zones, a large population, and diverse building materials. BD-SAT provides a high-resolution dataset that includes pixel-by-pixel LULC annotations for Dhaka metropolitan city and surrounding rural/urban areas. Using a strict and standardized procedure, the ground truth is created using Bing satellite imagery with a ground spatial distance of 2.22 meters per pixel. A three-stage, well-defined annotation process has been followed with support from GIS experts to ensure the reliability of the annotations. We performed several experiments to establish benchmark results. The results show that the annotated BD-SAT is sufficient to train large deep learning models with adequate accuracy for five major LULC classes: forest, farmland, built-up areas, water bodies, and meadows.

Submitted 9 June, 2024; originally announced June 2024.

Comments: 26 pages, 15 figures and 12 tables

arXiv:2403.13514 [pdf, other]

How Gender Interacts with Political Values: A Case Study on Czech BERT Models

Authors: Adnan Al Ali, Jindřich Libovický

Abstract: Neural language models, which reach state-of-the-art results on most natural language processing tasks, are trained on large text corpora that inevitably contain value-burdened content and often capture undesirable biases, which the models reflect. This case study focuses on the political biases of pre-trained encoders in Czech and compares them with a representative value survey. Because Czech is a gendered language, we also measure how the grammatical gender coincides with responses to men and women in the survey. We introduce a novel method for measuring the model's perceived political values. We find that the models do not assign statement probability following value-driven reasoning, and there is no systematic difference between feminine and masculine sentences. We conclude that BERT-sized models do not manifest systematic alignment with political values and that the biases observed in the models are rather due to superficial imitation of training data patterns than systematic value beliefs encoded in the models.

Submitted 20 March, 2024; originally announced March 2024.

Comments: 11 pages, 2 figures; LREC-COLING 2024

arXiv:2306.00031 [pdf, other]

doi 10.1016/j.procs.2023.08.198

Morphological Classification of Radio Galaxies using Semi-Supervised Group Equivariant CNNs

Authors: Mir Sazzat Hossain, Sugandha Roy, K. M. B. Asad, Arshad Momen, Amin Ahsan Ali, M Ashraful Amin, A. K. M. Mahbubur Rahman

Abstract: Out of the estimated few trillion galaxies, only around a million have been detected through radio frequencies, and only a tiny fraction, approximately a thousand, have been manually classified. We have addressed this disparity between labeled and unlabeled images of radio galaxies by employing a semi-supervised learning approach to classify them into the known Fanaroff-Riley Type I (FRI) and Type II (FRII) categories. A Group Equivariant Convolutional Neural Network (G-CNN) was used as an encoder of the state-of-the-art self-supervised methods SimCLR (A Simple Framework for Contrastive Learning of Visual Representations) and BYOL (Bootstrap Your Own Latent). The G-CNN preserves the equivariance for the Euclidean Group E(2), enabling it to effectively learn the representation of globally oriented feature maps. After representation learning, we trained a fully-connected classifier and fine-tuned the trained encoder with labeled data. Our findings demonstrate that our semi-supervised approach outperforms existing state-of-the-art methods across several metrics, including cluster quality, convergence rate, accuracy, precision, recall, and the F1-score. Moreover, statistical significance testing via a t-test revealed that our method surpasses the performance of a fully supervised G-CNN. This study emphasizes the importance of semi-supervised learning in radio galaxy classification, where labeled data are still scarce, but the prospects for discovery are immense.

Submitted 31 May, 2023; originally announced June 2023.

Comments: 9 pages, 6 figures, accepted in INNS Deep Learning Innovations and Applications (INNS DLIA 2023) workshop, IJCNN 2023, to be published in Procedia Computer Science

Journal ref: Procedia Computer Science, Volume 222, 2023, Pages 601-612

arXiv:2304.00622 [pdf, other]

Automatic Detection of Natural Disaster Effect on Paddy Field from Satellite Images using Deep Learning Techniques

Authors: Tahmid Alavi Ishmam, Amin Ahsan Ali, Md Ahsraful Amin, A K M Mahbubur Rahman

Abstract: This paper aims to detect rice field damage from natural disasters in Bangladesh using high-resolution satellite imagery. The authors developed ground truth data for rice field damage from the field level. At first, NDVI differences before and after the disaster are calculated to identify possible crop loss. The areas equal to and above the 0.33 threshold are marked as crop loss areas, as significant changes are observed. The authors also verified crop loss areas by collecting data from local farmers. Later, different bands of satellite data (Red, Green, Blue) and (False Color Infrared) are useful to detect crop loss areas. We used the NDVI difference images as ground truth to train the DeepLabV3plus model. With RGB, we got an IoU of 0.41, and with FCI, we got an IoU of 0.51. As FCI uses the NIR, Red, and Blue bands and NDVI is the normalized difference between the NIR and Red bands, a higher IoU score for FCI than for RGB is expected. But RGB does not perform very badly here. So, where other bands are not available, RGB can be used to understand crop loss areas to some extent. The ground truth developed in this paper can be used for segmentation models with very high resolution RGB-only images such as Bing, Google, etc.

Submitted 2 April, 2023; originally announced April 2023.

Comments: 6 pages, 13 figures. This paper has been accepted for presentation at the ICCRE2023 conference, held at Nagaoka University of Technology, Japan
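
The NDVI-difference labelling and the IoU metric mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the flattened per-pixel lists, and the direction of the difference (NDVI before minus NDVI after) are assumptions made for illustration; only the 0.33 threshold and the definition of NDVI as the normalized difference between the NIR and Red bands come from the abstract.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for a single pixel."""
    total = nir + red
    # Guard against division by zero on empty (all-zero) pixels.
    return 0.0 if total == 0 else (nir - red) / total


def crop_loss_mask(nir_before, red_before, nir_after, red_after, threshold=0.33):
    """Flag pixels whose NDVI fell by at least `threshold` after the event.

    Assumption: the difference is taken as before minus after, so a
    positive value means vegetation loss.
    """
    mask = []
    for nb, rb, na, ra in zip(nir_before, red_before, nir_after, red_after):
        drop = ndvi(nb, rb) - ndvi(na, ra)
        mask.append(drop >= threshold)
    return mask


def iou(pred, truth):
    """Intersection over union of two boolean masks of equal length."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0


if __name__ == "__main__":
    # Three hypothetical pixels: only the first loses vegetation.
    mask = crop_loss_mask(
        nir_before=[0.8, 0.8, 0.5], red_before=[0.2, 0.2, 0.5],
        nir_after=[0.5, 0.8, 0.5], red_after=[0.5, 0.2, 0.5],
    )
    print(mask)
    print(iou(mask, [True, True, False]))
```

In this sketch the thresholded mask plays the role of the ground truth, and `iou` is the score one would compute between a segmentation model's prediction and that mask.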

arXiv:2209.13715 [pdf, other]

Safe Balancing Control of a Soft Legged Robot

Authors: Ran Jing, Meredith L. Anderson, Miguel Ianus-Valdivia, Amsal Akber Ali, Carmel Majidi, Andrew P. Sabelhaus

Abstract: Legged robots constructed from soft materials are commonly claimed to demonstrate safer, more robust environmental interactions than their rigid counterparts. However, this motivating feature of soft robots requires more rigorous development for comparison to rigid locomotion. This article presents a soft legged robot platform, Horton, and a feedback control system with safety guarantees on some aspects of its operation. The robot is constructed using a series of soft limbs, actuated by thermal shape memory alloy (SMA) wire muscles, with sensors for its position and its actuator temperatures. A supervisory control scheme maintains safe actuator states during the operation of a separate controller for the robot's pose. Experiments demonstrate that Horton can lift its leg and maintain a balancing stance, a precursor to locomotion. The supervisor is verified in hardware via a human interaction test during balancing, keeping all SMA muscles below a temperature threshold. This work represents the first demonstration of a safety-verified feedback system on any soft legged robot.

Submitted 27 September, 2022; originally announced September 2022.

Comments: 8 pages, 4 figures

arXiv:2205.00581 [pdf]

Using a novel fractional-order gradient method for CNN back-propagation

Authors: Mundher Mohammed Taresh, Ningbo Zhu, Talal Ahmed Ali Ali, Mohammed Alghaili, Weihua Guo

Abstract: Computer-aided diagnosis tools have experienced rapid growth and development in recent years. Among all, deep learning is the most sophisticated and popular tool. In this paper, researchers propose a novel deep learning model and apply it to COVID-19 diagnosis. Our model uses the tool of fractional calculus, which has the potential to improve the performance of gradient methods. To this end, the researcher proposes a fractional-order gradient method for the back-propagation of convolutional neural networks based on the Caputo definition. However, if only the first term of the infinite series of the Caputo definition is used to approximate the fractional-order derivative, the length of the memory is truncated. Therefore, the fractional-order gradient (FGD) method with a fixed memory step and an adjustable number of terms is used to update the weights of the layers. Experiments were performed on the COVIDx dataset to demonstrate fast convergence, good accuracy, and the ability to bypass the local optimal point. We also compared the performance of the developed fractional-order neural networks and Integer-order neural networks. The results confirmed the effectiveness of our proposed model in the diagnosis of COVID-19.

Submitted 1 May, 2022; originally announced May 2022.

Comments: 9 pages, 6 figures

MSC Class: D.1.2; F.3.1; F.4.1 ACM Class: F.2.2, I.2.7 K.5

arXiv:2202.12689 [pdf, other]

Domain Adaptation: the Key Enabler of Neural Network Equalizers in Coherent Optical Systems

Authors: Pedro J. Freire, Bernhard Spinnler, Daniel Abode, Jaroslaw E. Prilepsky, Abdallah A. I. Ali, Nelson Costa, Wolfgang Schairer, Antonio Napoli, Andrew D. Ellis, Sergei K. Turitsyn

Abstract: We introduce the domain adaptation and randomization approach for calibrating neural network-based equalizers for real transmissions, using synthetic data. The approach renders up to 99% training process reduction, which we demonstrate in three experimental setups.

Submitted 25 February, 2022; originally announced February 2022.

Comments: Paper Accepted at OFC 2022

arXiv:2201.00985 [pdf, other]

Variational Stacked Local Attention Networks for Diverse Video Captioning

Authors: Tonmoay Deb, Akib Sadmanee, Kishor Kumar Bhaumik, Amin Ahsan Ali, M Ashraful Amin, A K M Mahbubur Rahman

Abstract: While describing spatio-temporal events in natural language, video captioning models mostly rely on the encoder's latent visual representation. Recent progress on the encoder-decoder model attends encoder features mainly in linear interaction with the decoder. However, growing model complexity for visual data encourages more explicit feature interaction for fine-grained information, which is currently absent in the video captioning domain. Moreover, feature aggregation methods have been used to unveil richer visual representation, either by concatenation or using a linear layer. Though feature sets for a video semantically overlap to some extent, these approaches result in objective mismatch and feature redundancy. In addition, diversity in captions is a fundamental component of expressing one event from several meaningful perspectives, currently missing in the temporal, i.e., video captioning domain. To this end, we propose Variational Stacked Local Attention Network (VSLAN), which exploits low-rank bilinear pooling for self-attentive feature interaction and stacking multiple video feature streams in a discount fashion. Each feature stack's learned attributes contribute to our proposed diversity encoding module, followed by the decoding query stage to facilitate end-to-end diverse and natural captions without any explicit supervision on attributes. We evaluate VSLAN on MSVD and MSR-VTT datasets in terms of syntax and diversity. The CIDEr score of VSLAN outperforms current off-the-shelf methods by 7.8% on MSVD and 4.5% on MSR-VTT, respectively. On the same datasets, VSLAN achieves competitive results in caption diversity metrics.

Submitted 4 January, 2022; originally announced January 2022.

Comments: To be published in Winter Conference on Applications of Computer Vision 2022

arXiv:2108.12734 [pdf, other]

Deep Dive into Semi-Supervised ELBO for Improving Classification Performance

Authors: Fahim Faisal Niloy, M. Ashraful Amin, AKM Mahbubur Rahman, Amin Ahsan Ali

Abstract: Decomposition of the evidence lower bound (ELBO) objective of VAE used for density estimation revealed the deficiency of VAE for representation learning and suggested ways to improve the model. In this paper, we investigate whether we can get similar insights by decomposing the ELBO for semi-supervised classification using a VAE model. Specifically, we show that mutual information between input and class labels decreases during maximization of the ELBO objective. We propose a method to address this issue. We also enforce the cluster assumption to aid in classification. Experiments on diverse datasets verify that our method can be used to improve the classification performance of existing VAE-based semi-supervised models. Experiments also show that this can be achieved without sacrificing the generative power of the model.

Submitted 20 November, 2022; v1 submitted 28 August, 2021; originally announced August 2021.

Comments: Under Review

arXiv:2107.01284 [pdf, other]

doi 10.1109/ICPR48806.2021.9412504

A Novel Disaster Image Dataset and Characteristics Analysis using Attention Model

Authors: Fahim Faisal Niloy, Arif, Abu Bakar Siddik Nayem, Anis Sarker, Ovi Paul, M. Ashraful Amin, Amin Ahsan Ali, Moinul Islam Zaber, AKM Mahbubur Rahman

Abstract: The advancement of deep learning technology has enabled us to develop systems that outperform any other classification technique. However, success of any empirical system depends on the quality and diversity of the data available to train the proposed system. In this research, we have carefully accumulated a relatively challenging dataset that contains images collected from various sources for three different disasters: fire, water and land. Besides this, we have also collected images for various damaged infrastructure due to natural or man-made calamities and damaged human due to war or accidents. We have also accumulated image data for a class named non-damage that contains images with no such disaster or sign of damage in them. There are 13,720 manually annotated images in this dataset; each image is annotated by three individuals. We are also providing discriminating image class information annotated manually with bounding boxes for a set of 200 test images. Images are collected from different news portals, social media, and standard datasets made available by other researchers. A three-layer attention model (TLAM) is trained and an average five-fold validation accuracy of 95.88% is achieved. Moreover, on the 200 unseen test images this accuracy is 96.48%. We also generate and compare attention maps for these test images to determine the characteristics of the trained attention model. Our dataset is available at https://niloy193.github.io/Disaster-Dataset

Submitted 2 July, 2021; originally announced July 2021.

Comments: ICPR 2020

arXiv:2106.12902 [pdf, other]

doi 10.1109/ICIP42928.2021.9506194

Attention Toward Neighbors: A Context Aware Framework for High Resolution Image Segmentation

Authors: Fahim Faisal Niloy, M. Ashraful Amin, Amin Ahsan Ali, AKM Mahbubur Rahman

Abstract: High-resolution image segmentation remains challenging and error-prone due to the enormous size of intermediate feature maps. Conventional methods avoid this problem by using patch based approaches where each patch is segmented independently. However, independent patch segmentation induces errors, particularly at the patch boundary due to the lack of contextual information in very high-resolution images where the patch size is much smaller compared to the full image. To overcome these limitations, in this paper, we propose a novel framework to segment a particular patch by incorporating contextual information from its neighboring patches. This allows the segmentation network to see the target patch with a wider field of view without the need of larger feature maps. Comparative analysis from a number of experiments shows that our proposed framework is able to segment high resolution images with significantly improved mean Intersection over Union and overall accuracy.

Submitted 24 June, 2021; originally announced June 2021.

Comments: Accepted at ICIP 2021

arXiv:2104.13014 [pdf, other]

Node Embedding using Mutual Information and Self-Supervision based Bi-level Aggregation

Authors: Kashob Kumar Roy, Amit Roy, A K M Mahbubur Rahman, M Ashraful Amin, Amin Ahsan Ali

Abstract: Graph Neural Networks (GNNs) learn low dimensional representations of nodes by aggregating information from their neighborhood in graphs. However, traditional GNNs suffer from two fundamental shortcomings due to their local ($l$-hop neighborhood) aggregation scheme. First, not all nodes in the neighborhood carry relevant information for the target node. Since GNNs do not exclude noisy nodes in their neighborhood, irrelevant information gets aggregated, which reduces the quality of the representation. Second, traditional GNNs also fail to capture long-range non-local dependencies between nodes. To address these limitations, we exploit mutual information (MI) to define two types of neighborhood, 1) Local Neighborhood, where nodes are densely connected within a community and each node would share higher MI with its neighbors, and 2) Non-Local Neighborhood, where MI-based node clustering is introduced to assemble informative but graphically distant nodes in the same cluster. To generate node representations, we combine the embeddings generated by bi-level aggregation: local aggregation to aggregate features from local neighborhoods to avoid noisy information, and non-local aggregation to aggregate features from non-local neighborhoods. Furthermore, we leverage self-supervision learning to estimate MI with few labeled data. Finally, we show that our model significantly outperforms the state-of-the-art methods in a wide range of assortative and disassortative graphs.

Submitted 27 April, 2021; originally announced April 2021.

Comments: Accepted at IJCNN 2021

arXiv:2104.13012 [pdf, other]

Structure-Aware Hierarchical Graph Pooling using Information Bottleneck

Authors: Kashob Kumar Roy, Amit Roy, A K M Mahbubur Rahman, M Ashraful Amin, Amin Ahsan Ali

Abstract: Graph pooling is an essential ingredient of Graph Neural Networks (GNNs) in graph classification and regression tasks. For these tasks, different pooling strategies have been proposed to generate a graph-level representation by downsampling and summarizing nodes' features in a graph. However, most existing pooling methods are unable to capture distinguishable structural information effectively. Besides, they are prone to adversarial attacks. In this work, we propose a novel pooling method named HIBPool, where we leverage the Information Bottleneck (IB) principle that optimally balances the expressiveness and robustness of a model to learn representations of input data. Furthermore, we introduce a novel structure-aware Discriminative Pooling Readout (DiP-Readout) function to capture the informative local subgraph structures in the graph. Finally, our experimental results show that our model significantly outperforms other state-of-the-art methods on several graph classification benchmarks and is more resilient to feature-perturbation attack than existing pooling methods.

Submitted 27 April, 2021; originally announced April 2021.

Comments: Accepted at IJCNN 2021

arXiv:2104.12518 [pdf, other]

Unified Spatio-Temporal Modeling for Traffic Forecasting using Graph Neural Network

Authors: Amit Roy, Kashob Kumar Roy, Amin Ahsan Ali, M Ashraful Amin, A K M Mahbubur Rahman

Abstract: Research in deep learning models to forecast traffic intensities has gained great attention in recent years due to their capability to capture the complex spatio-temporal relationships within the traffic data. However, most state-of-the-art approaches have designed spatial-only (e.g. Graph Neural Networks) and temporal-only (e.g. Recurrent Neural Networks) modules to separately extract spatial and temporal features. We argue that it is less effective to extract the complex spatio-temporal relationship with such factorized modules. Besides, most existing works predict the traffic intensity of a particular time interval only based on the traffic data of the previous one hour of that day, and thereby ignore the repetitive daily/weekly pattern that may exist in the last hour of data. Therefore, we propose a Unified Spatio-Temporal Graph Convolution Network (USTGCN) for traffic forecasting that performs both spatial and temporal aggregation through direct information propagation across different timestamp nodes with the help of spectral graph convolution on a spatio-temporal graph. Furthermore, it captures historical daily patterns in previous days and current-day patterns in current-day traffic data. Finally, we validate our work's effectiveness through experimental analysis, which shows that our model USTGCN can outperform state-of-the-art performances in three popular benchmark datasets from the Performance Measurement System (PeMS). Moreover, the training time is reduced significantly with our proposed USTGCN model.

Submitted 28 April, 2021; v1 submitted 26 April, 2021; originally announced April 2021.

Comments: Accepted for publication in International Joint Conference on Neural Networks (IJCNN-2021)

arXiv:2104.00055 [pdf, other]

SST-GNN: Simplified Spatio-temporal Traffic forecasting model using Graph Neural Network

Authors: Amit Roy, Kashob Kumar Roy, Amin Ahsan Ali, M Ashraful Amin, A K M Mahbubur Rahman

Abstract: To capture spatial relationships and temporal dynamics in traffic data, spatio-temporal models for traffic forecasting have drawn significant attention in recent years. Most of the recent works employed graph neural networks (GNN) with multiple layers to capture the spatial dependency. However, road junctions with different hop-distance can carry distinct traffic information which should be exploited separately, but existing multi-layer GNNs are unable to discriminate between their impact. Again, to capture the temporal interrelationship, recurrent neural networks are common in state-of-the-art approaches that often fail to capture long-range dependencies. Furthermore, traffic data shows repeated patterns in a daily or weekly period which should be addressed explicitly. To address these limitations, we have designed a Simplified Spatio-temporal Traffic forecasting GNN (SST-GNN) that effectively encodes the spatial dependency by separately aggregating different neighborhood representations rather than with multiple layers and captures the temporal dependency with a simple yet effective weighted spatio-temporal aggregation mechanism. We capture the periodic traffic patterns by using a novel position encoding scheme with historical and current data in two different models. With extensive experimental analysis, we have shown that our model has significantly outperformed the state-of-the-art models on three real-world traffic datasets from the Performance Measurement System (PeMS).

Submitted 31 March, 2021; originally announced April 2021.

Comments: Accepted for publication in 25th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-2021)

arXiv:2103.04279 [pdf, other]

doi 10.1007/978-3-030-75768-7_28

Hierarchical Self Attention Based Autoencoder for Open-Set Human Activity Recognition

Authors: M Tanjid Hasan Tonmoy, Saif Mahmud, A K M Mahbubur Rahman, M Ashraful Amin, Amin Ahsan Ali

Abstract: Wearable sensor based human activity recognition is a challenging problem due to difficulty in modeling spatial and temporal dependencies of sensor signals. Recognition models in closed-set assumption are forced to yield members of known activity classes as prediction. However, activity recognition models can encounter an unseen activity due to body-worn sensor malfunction or disability of the subject performing the activities. This problem can be addressed through modeling solution according to the assumption of open-set recognition. Hence, the proposed self attention based approach combines data hierarchically from different sensor placements across time to classify closed-set activities and it obtains notable performance improvement over state-of-the-art models on five publicly available datasets. The decoder in this autoencoder architecture incorporates self-attention based feature representations from encoder to detect unseen activity classes in open-set recognition setting. Furthermore, attention maps generated by the hierarchical model demonstrate explainable selection of features in activity recognition. We conduct extensive leave one subject out validation experiments that indicate significantly improved robustness to noise and subject specific variability in body-worn sensor signals. The source code is available at: github.com/saif-mahmud/hierarchical-attention-HAR

Submitted 7 March, 2021; originally announced March 2021.

Comments: Accepted for publication in 25th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-2021)

arXiv:2102.04844 [pdf]

Contact Tracing Apps for COVID-19: Access Permission and User Adoption

Authors: Amal Awadalla Ali, Asma Hamid ElFadl, Maha Fawzy Abujazar, Sarah Aziz, Alaa Abd-Alrazaq, Zubair Shah, Samir Brahim Belhaouari, Mowafa Househ, Tanvir Alam

Abstract: Contact tracing apps are powerful software tools that can help control the spread of COVID-19. In this article, we evaluated 53 COVID-19 contact tracing apps found on the Google Play Store in terms of their usage, rating, access permissions, and user privacy. For each app included in the study, we identified the country of origin, number of downloads, and access permissions to further understand the attributes and ratings of the apps. Our results show that contact tracing apps had low overall ratings and that nearly 40% of the included apps requested dangerous access permissions, including access to storage, media files, and the camera. We also found that user adoption rates were inversely correlated with access permission requirements. To the best of our knowledge, our article summarizes the most extensive collection of contact tracing apps for COVID-19. We recommend that future contact tracing apps be more transparent in their permission requirements and provide justification for the permissions requested, to preserve app users' privacy.

Submitted 6 February, 2021; originally announced February 2021.

Comments: Contact Tracing Apps for COVID-19

arXiv:2012.01494 [pdf]

Braille to Text Translation for Bengali Language: A Geometric Approach

Authors: Minhas Kamal, Amin Ahsan Ali, Muhammad Asif Hossain Khan, Mohammad Shoyaib

Abstract: Braille is the only system available to visually impaired people for reading and writing. However, the general public cannot read Braille, so teachers and relatives find it hard to assist them with learning. Almost every major language has software solutions for this translation purpose. However, no such tool exists for Bengali. Here, we propose Braille to Text Translator, which takes an image of these tactile alphabets and translates them to plain text. Image deterioration, scan-time page rotation, and Braille dot deformation are the principal issues in this scheme. All of these challenges are directly checked using special image processing and geometric structure analysis. The technique yields 97.25% accuracy in recognizing Braille characters.

Submitted 2 December, 2020; originally announced December 2020.

Journal ref: In Jahangirnagar University Journal of Information Technology (JJIT), pp. 93-111, 2018

arXiv:2011.12847 [pdf, other]

Deep-learning coupled with novel classification method to classify the urban environment of the developing world

Authors: Qianwei Cheng, AKM Mahbubur Rahman, Anis Sarker, Abu Bakar Siddik Nayem, Ovi Paul, Amin Ahsan Ali, M Ashraful Amin, Ryosuke Shibasaki, Moinul Zaber

Abstract: Rapid globalization and the interdependence of humanity engender a tremendous inflow of human migration towards urban spaces. With the advent of high-definition satellite images, high-resolution data, computational methods such as deep neural networks, and capable hardware, urban planning is seeing a paradigm shift. Legacy data on urban environments are now being complemented with high-volume, high-frequency data. In this paper we propose a novel classification method that is readily usable for machine analysis and show the applicability of the methodology in a developing-world setting. The state of the art is mostly dominated by classification of building structures, building types, etc., and largely represents the developed world, which is insufficient for developing countries such as Bangladesh, where the surroundings are crucial for the classification. Moreover, the traditional methods propose small-scale classifications, which give limited information, scale poorly, and are slow to compute. We categorize the urban area in terms of informal and formal spaces, taking the surroundings into account. A 50 km x 50 km Google Earth image of Dhaka, Bangladesh was visually annotated and categorized by an expert. The classification is based broadly on two dimensions: urbanization and the architectural form of the urban environment. Consequently, the urban space is divided into four classes: 1) highly informal; 2) moderately informal; 3) moderately formal; and 4) highly formal areas. In total, 16 sub-classes were identified.
For semantic segmentation, Google's DeeplabV3+ model was used, which increases the field of view of the filters to incorporate larger context. Imagery encompassing 70% of the urban space was used for training, and the remaining 30% was used for testing and validation. The model is able to segment with 75% accuracy and 60% Mean IoU.

Submitted 7 January, 2021; v1 submitted 25 November, 2020; originally announced November 2020.

Comments: Accepted paper at 2nd International Conference on Signal Processing and Machine Learning (SIGML 2021); 20 pages, 7 figures, 1 table

arXiv:2011.02359 [pdf, other]

Modeling Traffic Congestion in Developing Countries using Google Maps Data

Authors: Md. Aktaruzzaman Pramanik, Md Mahbubur Rahman, ASM Iftekhar Anam, Amin Ahsan Ali, M Ashraful Amin, A K M Mahbubur Rahman

Abstract: Traffic congestion research is on the rise, thanks to urbanization, economic growth, and industrialization. Developed countries invest a lot of research money in collecting traffic data using Radio Frequency Identification (RFID), loop detectors, speed sensors, high-end traffic lights, and GPS. However, these processes are expensive, infeasible, and non-scalable for developing countries with numerous non-motorized vehicles, proliferating ride-sharing services, and frequent pedestrians. This paper proposes a novel approach to collect traffic data from the Google Maps traffic layer at minimal cost. We have implemented widely used models such as Historical Averages (HA), Support Vector Regression (SVR), Support Vector Regression with Graph (SVR-Graph), and Auto-Regressive Integrated Moving Average (ARIMA) to show the efficacy of the collected traffic data in forecasting future congestion. We show that even with these simple models, we could predict traffic congestion ahead of time. We also demonstrate that traffic patterns are significantly different between weekdays and weekends.

Submitted 29 October, 2020; originally announced November 2020.

arXiv:2003.09018 [pdf, other]

Human Activity Recognition from Wearable Sensor Data Using Self-Attention

Authors: Saif Mahmud, M Tanjid Hasan Tonmoy, Kishor Kumar Bhaumik, A K M Mahbubur Rahman, M Ashraful Amin, Mohammad Shoyaib, Muhammad Asif Hossain Khan, Amin Ahsan Ali

Abstract: Human Activity Recognition from body-worn sensor data poses an inherent challenge in capturing spatial and temporal dependencies of time-series signals. In this regard, the existing recurrent, convolutional, or hybrid models for activity recognition struggle to capture spatio-temporal context from the feature space of the sensor reading sequence. To address this complex problem, we propose a self-attention based neural network model that forgoes recurrent architectures and utilizes different types of attention mechanisms to generate a higher-dimensional feature representation used for classification. We performed extensive experiments on four popular publicly available HAR datasets: PAMAP2, Opportunity, Skoda and USC-HAD. Our model achieves significant performance improvement over recent state-of-the-art models on both benchmark test subjects and leave-one-subject-out evaluation. We also observe that the sensor attention maps produced by our model are able to capture the importance of the modality and placement of the sensors in predicting the different activity classes.

Submitted 17 March, 2020; originally announced March 2020.

Comments: Accepted for publication at the 24th European Conference on Artificial Intelligence (ECAI-2020); 8 pages, 4 figures

arXiv:1810.05041 [pdf, other]

doi 10.3390/e21080741

A General Framework for Fair Regression

Authors: Jack Fitzsimons, AbdulRahman Al Ali, Michael Osborne, Stephen Roberts

Abstract: Fairness, through its many forms and definitions, has become an important issue facing the machine learning community. In this work, we consider how to incorporate group fairness constraints in kernel regression methods, applicable to Gaussian processes, support vector machines, neural network regression and decision tree regression. Further, we focus on examining the effect of incorporating these constraints in decision tree regression, with direct applications to random forests and boosted trees amongst other widespread popular inference techniques. We show that the order of complexity of memory and computation is preserved for such models and tightly bound the expected perturbations to the model in terms of the number of leaves of the trees. Importantly, the approach works on trained models and hence can be easily applied to models in current use and group labels are only required on training data.

Submitted 2 February, 2019; v1 submitted 10 October, 2018; originally announced October 2018.

Comments: 8 pages, 4 figures, 2 pages references

arXiv:0706.4004 [pdf]

End-to-End Available Bandwidth Measurement Tools : A Comparative Evaluation of Performances

Authors: Ahmed Ait Ali, Fabien Michaut, Francis Lepage

Abstract: In recent years, there has been a strong interest in measuring the available bandwidth of network paths. Several methods and techniques have been proposed and various measurement tools have been developed and evaluated. However, there have been few comparative studies with regard to the actual performance of these tools. This paper presents a study of available bandwidth measurement techniques and undertakes a comparative analysis of active probing tools in terms of accuracy, intrusiveness and response time. Finally, measurement errors and the uncertainty of the tools are analysed and overall conclusions are drawn.

Submitted 27 June, 2007; originally announced June 2007.

Journal ref: IPS-MoMe 2006 IEEE/ACM International Workshop on Internet Performance, Simulation, Monitoring and Measurement, Austria (27/02/2005) 13

Showing 1–23 of 23 results for author: Ali, A A