Search | arXiv e-print repository

Exploring Bengali Religious Dialect Biases in Large Language Models with Evaluation Perspectives

Authors: Azmine Toushik Wasi, Raima Islam, Mst Rafia Islam, Taki Hasan Rafi, Dong-Kyu Chae

Abstract: While Large Language Models (LLM) have created a massive technological impact in the past decade, allowing for human-enabled applications, they can produce output that contains stereotypes and biases, especially when using low-resource languages. This can be of great ethical concern when dealing with sensitive topics such as religion. As a means toward making LLMS more fair, we explore bias from a… ▽ More While Large Language Models (LLM) have created a massive technological impact in the past decade, allowing for human-enabled applications, they can produce output that contains stereotypes and biases, especially when using low-resource languages. This can be of great ethical concern when dealing with sensitive topics such as religion. As a means toward making LLMS more fair, we explore bias from a religious perspective in Bengali, focusing specifically on two main religious dialects: Hindu and Muslim-majority dialects. Here, we perform different experiments and audit showing the comparative analysis of different sentences using three commonly used LLMs: ChatGPT, Gemini, and Microsoft Copilot, pertaining to the Hindu and Muslim dialects of specific words and showcasing which ones catch the social biases and which do not. Furthermore, we analyze our findings and relate them to potential reasons and evaluation perspectives, considering their global impact with over 300 million speakers worldwide. With this work, we hope to establish the rigor for creating more fairness in LLMs, as these are widely used as creative writing agents. △ Less

Submitted 25 July, 2024; originally announced July 2024.

Comments: 10 Pages, 4 Figures. Accepted to the 1st Human-centered Evaluation and Auditing of Language Models Workshop at CHI 2024 (Workshop website: https://heal-workshop.github.io/#:~:text=Exploring%20Bengali%20Religious%20Dialect%20Biases%20in%20Large%20Language%20Models%20with%20Evaluation%20Perspectives)

arXiv:2407.09828 [pdf, other]

Enhancing Semantic Segmentation with Adaptive Focal Loss: A Novel Approach

Authors: Md Rakibul Islam, Riad Hassan, Abdullah Nazib, Kien Nguyen, Clinton Fookes, Md Zahidul Islam

Abstract: Deep learning has achieved outstanding accuracy in medical image segmentation, particularly for objects like organs or tumors with smooth boundaries or large sizes. Whereas, it encounters significant difficulties with objects that have zigzag boundaries or are small in size, leading to a notable decrease in segmentation effectiveness. In this context, using a loss function that incorporates smooth… ▽ More Deep learning has achieved outstanding accuracy in medical image segmentation, particularly for objects like organs or tumors with smooth boundaries or large sizes. Whereas, it encounters significant difficulties with objects that have zigzag boundaries or are small in size, leading to a notable decrease in segmentation effectiveness. In this context, using a loss function that incorporates smoothness and volume information into a model's predictions offers a promising solution to these shortcomings. In this work, we introduce an Adaptive Focal Loss (A-FL) function designed to mitigate class imbalance by down-weighting the loss for easy examples that results in up-weighting the loss for hard examples and giving greater emphasis to challenging examples, such as small and irregularly shaped objects. The proposed A-FL involves dynamically adjusting a focusing parameter based on an object's surface smoothness, size information, and adjusting the class balancing parameter based on the ratio of targeted area to total area in an image. We evaluated the performance of the A-FL using ResNet50-encoded U-Net architecture on the Picai 2022 and BraTS 2018 datasets. On the Picai 2022 dataset, the A-FL achieved an Intersection over Union (IoU) of 0.696 and a Dice Similarity Coefficient (DSC) of 0.769, outperforming the regular Focal Loss (FL) by 5.5% and 5.4% respectively. It also surpassed the best baseline Dice-Focal by 2.0% and 1.2%. On the BraTS 2018 dataset, A-FL achieved an IoU of 0.883 and a DSC of 0.931. The comparative studies show that the proposed A-FL function surpasses conventional methods, including Dice Loss, Focal Loss, and their hybrid variants, in IoU, DSC, Sensitivity, and Specificity metrics. This work highlights A-FL's potential to improve deep learning models for segmenting clinically significant regions in medical images, leading to more precise and reliable diagnostic tools. △ Less

Submitted 13 July, 2024; originally announced July 2024.

Comments: 15 pages, 4 figures

arXiv:2407.07361 [pdf, other]

Characterizing Encrypted Application Traffic through Cellular Radio Interface Protocol

Authors: Md Ruman Islam, Raja Hasnain Anwar, Spyridon Mastorakis, Muhammad Taqi Raza

Abstract: Modern applications are end-to-end encrypted to prevent data from being read or secretly modified. 5G tech nology provides ubiquitous access to these applications without compromising the application-specific performance and latency goals. In this paper, we empirically demonstrate that 5G radio communication becomes the side channel to precisely infer the user's applications in real-time. The key… ▽ More Modern applications are end-to-end encrypted to prevent data from being read or secretly modified. 5G tech nology provides ubiquitous access to these applications without compromising the application-specific performance and latency goals. In this paper, we empirically demonstrate that 5G radio communication becomes the side channel to precisely infer the user's applications in real-time. The key idea lies in observing the 5G physical and MAC layer interactions over time that reveal the application's behavior. The MAC layer receives the data from the application and requests the network to assign the radio resource blocks. The network assigns the radio resources as per application requirements, such as priority, Quality of Service (QoS) needs, amount of data to be transmitted, and buffer size. The adversary can passively observe the radio resources to fingerprint the applications. We empirically demonstrate this attack by considering four different categories of applications: online shopping, voice/video conferencing, video streaming, and Over-The-Top (OTT) media platforms. Finally, we have also demonstrated that an attacker can differentiate various types of applications in real-time within each category. △ Less

Submitted 20 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

Comments: 9 pages, 8 figures, 2 tables. This paper has been accepted for publication by the 21st IEEE International Conference on Mobile Ad-Hoc and Smart Systems (MASS 2024)

arXiv:2407.05467 [pdf, other]

The infrastructure powering IBM's Gen AI model development

Authors: Talia Gershon, Seetharami Seelam, Brian Belgodere, Milton Bonilla, Lan Hoang, Danny Barnett, I-Hsin Chung, Apoorve Mohan, Ming-Hung Chen, Lixiang Luo, Robert Walkup, Constantinos Evangelinos, Shweta Salaria, Marc Dombrowa, Yoonho Park, Apo Kayi, Liran Schour, Alim Alim, Ali Sydney, Pavlos Maniotis, Laurent Schares, Bernard Metzler, Bengi Karacali-Akyamac, Sophia Wen, Tatsuhiro Chiba , et al. (121 additional authors not shown)

Abstract: AI Infrastructure plays a key role in the speed and cost-competitiveness of developing and deploying advanced AI models. The current demand for powerful AI infrastructure for model training is driven by the emergence of generative AI and foundational models, where on occasion thousands of GPUs must cooperate on a single training job for the model to be trained in a reasonable time. Delivering effi… ▽ More AI Infrastructure plays a key role in the speed and cost-competitiveness of developing and deploying advanced AI models. The current demand for powerful AI infrastructure for model training is driven by the emergence of generative AI and foundational models, where on occasion thousands of GPUs must cooperate on a single training job for the model to be trained in a reasonable time. Delivering efficient and high-performing AI training requires an end-to-end solution that combines hardware, software and holistic telemetry to cater for multiple types of AI workloads. In this report, we describe IBM's hybrid cloud infrastructure that powers our generative AI model development. This infrastructure includes (1) Vela: an AI-optimized supercomputing capability directly integrated into the IBM Cloud, delivering scalable, dynamic, multi-tenant and geographically distributed infrastructure for large-scale model training and other AI workflow steps and (2) Blue Vela: a large-scale, purpose-built, on-premises hosting environment that is optimized to support our largest and most ambitious AI model training tasks. Vela provides IBM with the dual benefit of high performance for internal use along with the flexibility to adapt to an evolving commercial landscape. Blue Vela provides us with the benefits of rapid development of our largest and most ambitious models, as well as future-proofing against the evolving model landscape in the industry. Taken together, they provide IBM with the ability to rapidly innovate in the development of both AI models and commercial offerings. △ Less

Submitted 7 July, 2024; originally announced July 2024.

Comments: Corresponding Authors: Talia Gershon, Seetharami Seelam,Brian Belgodere, Milton Bonilla

arXiv:2407.03154 [pdf, other]

Reinforcement Learning for Sequence Design Leveraging Protein Language Models

Authors: Jithendaraa Subramanian, Shivakanth Sujit, Niloy Irtisam, Umong Sain, Derek Nowrouzezahrai, Samira Ebrahimi Kahou, Riashat Islam

Abstract: Protein sequence design, determined by amino acid sequences, are essential to protein engineering problems in drug discovery. Prior approaches have resorted to evolutionary strategies or Monte-Carlo methods for protein design, but often fail to exploit the structure of the combinatorial search space, to generalize to unseen sequences. In the context of discrete black box optimization over large se… ▽ More Protein sequence design, determined by amino acid sequences, are essential to protein engineering problems in drug discovery. Prior approaches have resorted to evolutionary strategies or Monte-Carlo methods for protein design, but often fail to exploit the structure of the combinatorial search space, to generalize to unseen sequences. In the context of discrete black box optimization over large search spaces, learning a mutation policy to generate novel sequences with reinforcement learning is appealing. Recent advances in protein language models (PLMs) trained on large corpora of protein sequences offer a potential solution to this problem by scoring proteins according to their biological plausibility (such as the TM-score). In this work, we propose to use PLMs as a reward function to generate new sequences. Yet the PLM can be computationally expensive to query due to its large size. To this end, we propose an alternative paradigm where optimization can be performed on scores from a smaller proxy model that is periodically finetuned, jointly while learning the mutation policy. We perform extensive experiments on various sequence lengths to benchmark RL-based approaches, and provide comprehensive evaluations along biological plausibility and diversity of the protein. Our experimental results include favorable evaluations of the proposed sequences, along with high diversity scores, demonstrating that RL is a strong candidate for biological sequence design. Finally, we provide a modular open source implementation can be easily integrated in most RL training loops, with support for replacing the reward model with other PLMs, to spur further research in this domain. The code for all experiments is provided in the supplementary material. △ Less

Submitted 3 July, 2024; originally announced July 2024.

Comments: 22 pages, 7 figures, 4 tables

arXiv:2407.01011 [pdf]

Parallel Computing Architectures for Robotic Applications: A Comprehensive Review

Authors: Md Rafid Islam

Abstract: With the growing complexity and capability of contemporary robotic systems, the necessity of sophisticated computing solutions to efficiently handle tasks such as real-time processing, sensor integration, decision-making, and control algorithms is also increasing. Conventional serial computing frequently fails to meet these requirements, underscoring the necessity for high-performance computing al… ▽ More With the growing complexity and capability of contemporary robotic systems, the necessity of sophisticated computing solutions to efficiently handle tasks such as real-time processing, sensor integration, decision-making, and control algorithms is also increasing. Conventional serial computing frequently fails to meet these requirements, underscoring the necessity for high-performance computing alternatives. Parallel computing, the utilization of several processing elements simultaneously to solve computational problems, offers a possible answer. Various parallel computing designs, such as multi-core CPUs, GPUs, FPGAs, and distributed systems, provide substantial enhancements in processing capacity and efficiency. By utilizing these architectures, robotic systems can attain improved performance in functionalities such as real-time image processing, sensor fusion, and path planning. The transformative potential of parallel computing architectures in advancing robotic technology has been underscored, real-life case studies of these architectures in the robotics field have been discussed, and comparisons are presented. Challenges pertaining to these architectures have been explored, and possible solutions have been mentioned for further research and enhancement of the robotic applications. △ Less

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2406.17181 [pdf, other]

FacePsy: An Open-Source Affective Mobile Sensing System -- Analyzing Facial Behavior and Head Gesture for Depression Detection in Naturalistic Settings

Authors: Rahul Islam, Sang Won Bae

Abstract: Depression, a prevalent and complex mental health issue affecting millions worldwide, presents significant challenges for detection and monitoring. While facial expressions have shown promise in laboratory settings for identifying depression, their potential in real-world applications remains largely unexplored due to the difficulties in developing efficient mobile systems. In this study, we aim t… ▽ More Depression, a prevalent and complex mental health issue affecting millions worldwide, presents significant challenges for detection and monitoring. While facial expressions have shown promise in laboratory settings for identifying depression, their potential in real-world applications remains largely unexplored due to the difficulties in developing efficient mobile systems. In this study, we aim to introduce FacePsy, an open-source mobile sensing system designed to capture affective inferences by analyzing sophisticated features and generating real-time data on facial behavior landmarks, eye movements, and head gestures -- all within the naturalistic context of smartphone usage with 25 participants. Through rigorous development, testing, and optimization, we identified eye-open states, head gestures, smile expressions, and specific Action Units (2, 6, 7, 12, 15, and 17) as significant indicators of depressive episodes (AUROC=81%). Our regression model predicting PHQ-9 scores achieved moderate accuracy, with a Mean Absolute Error of 3.08. Our findings offer valuable insights and implications for enhancing deployable and usable mobile affective sensing systems, ultimately improving mental health monitoring, prediction, and just-in-time adaptive interventions for researchers and developers in healthcare. △ Less

Submitted 24 June, 2024; originally announced June 2024.

Comments: Accepted to ACM International Conference on Mobile Human-Computer Interaction (MobileHCI 2024)

arXiv:2406.15942 [pdf, other]

Revolutionizing Mental Health Support: An Innovative Affective Mobile Framework for Dynamic, Proactive, and Context-Adaptive Conversational Agents

Authors: Rahul Islam, Sang Won Bae

Abstract: As we build towards developing interactive systems that can recognize human emotional states and respond to individual needs more intuitively and empathetically in more personalized and context-aware computing time. This is especially important regarding mental health support, with a rising need for immediate, non-intrusive help tailored to each individual. Individual mental health and the complex… ▽ More As we build towards developing interactive systems that can recognize human emotional states and respond to individual needs more intuitively and empathetically in more personalized and context-aware computing time. This is especially important regarding mental health support, with a rising need for immediate, non-intrusive help tailored to each individual. Individual mental health and the complex nature of human emotions call for novel approaches beyond conventional proactive and reactive-based chatbot approaches. In this position paper, we will explore how to create Chatbots that can sense, interpret, and intervene in emotional signals by combining real-time facial expression analysis, physiological signal interpretation, and language models. This is achieved by incorporating facial affect detection into existing practical and ubiquitous passive sensing contexts, thus empowering them with the capabilities to the ubiquity of sensing behavioral primitives to recognize, interpret, and respond to human emotions. In parallel, the system employs cognitive-behavioral therapy tools such as cognitive reframing and mood journals, leveraging the therapeutic intervention potential of Chatbots in mental health contexts. Finally, we propose a project to build a system that enhances the emotional understanding of Chatbots to engage users in chat-based intervention, thereby helping manage their mood. △ Less

Submitted 22 June, 2024; originally announced June 2024.

Comments: Accepted to Ubicomp '23, GenAI4PC Symposium

arXiv:2405.20313 [pdf, other]

Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation

Authors: Guillaume Huguet, James Vuckovic, Kilian Fatras, Eric Thibodeau-Laufer, Pablo Lemos, Riashat Islam, Cheng-Hao Liu, Jarrid Rector-Brooks, Tara Akhound-Sadegh, Michael Bronstein, Alexander Tong, Avishek Joey Bose

Abstract: Proteins are essential for almost all biological processes and derive their diverse functions from complex 3D structures, which are in turn determined by their amino acid sequences. In this paper, we exploit the rich biological inductive bias of amino acid sequences and introduce FoldFlow-2, a novel sequence-conditioned SE(3)-equivariant flow matching model for protein structure generation. FoldFl… ▽ More Proteins are essential for almost all biological processes and derive their diverse functions from complex 3D structures, which are in turn determined by their amino acid sequences. In this paper, we exploit the rich biological inductive bias of amino acid sequences and introduce FoldFlow-2, a novel sequence-conditioned SE(3)-equivariant flow matching model for protein structure generation. FoldFlow-2 presents substantial new architectural features over the previous FoldFlow family of models including a protein large language model to encode sequence, a new multi-modal fusion trunk that combines structure and sequence representations, and a geometric transformer based decoder. To increase diversity and novelty of generated samples -- crucial for de-novo drug design -- we train FoldFlow-2 at scale on a new dataset that is an order of magnitude larger than PDB datasets of prior works, containing both known proteins in PDB and high-quality synthetic structures achieved through filtering. We further demonstrate the ability to align FoldFlow-2 to arbitrary rewards, e.g. increasing secondary structures diversity, by introducing a Reinforced Finetuning (ReFT) objective. We empirically observe that FoldFlow-2 outperforms previous state-of-the-art protein structure-based generative models, improving over RFDiffusion in terms of unconditional generation across all metrics including designability, diversity, and novelty across all protein lengths, as well as exhibiting generalization on the task of equilibrium conformation sampling. Finally, we demonstrate that a fine-tuned FoldFlow-2 makes progress on challenging conditional design tasks such as designing scaffolds for the VHH nanobody. △ Less

Submitted 30 May, 2024; originally announced May 2024.

Comments: preprint

arXiv:2405.08359 [pdf, other]

GPS-IDS: An Anomaly-based GPS Spoofing Attack Detection Framework for Autonomous Vehicles

Authors: Murad Mehrab Abrar, Raian Islam, Shalaka Satam, Sicong Shao, Salim Hariri, Pratik Satam

Abstract: Autonomous Vehicles (AVs) heavily rely on sensors and communication networks like Global Positioning System (GPS) to navigate autonomously. Prior research has indicated that networks like GPS are vulnerable to cyber-attacks such as spoofing and jamming, thus posing serious risks like navigation errors and system failures. These threats are expected to intensify with the widespread deployment of AV… ▽ More Autonomous Vehicles (AVs) heavily rely on sensors and communication networks like Global Positioning System (GPS) to navigate autonomously. Prior research has indicated that networks like GPS are vulnerable to cyber-attacks such as spoofing and jamming, thus posing serious risks like navigation errors and system failures. These threats are expected to intensify with the widespread deployment of AVs, making it crucial to detect and mitigate such attacks. This paper proposes GPS Intrusion Detection System, or GPS-IDS, an Anomaly Behavior Analysis (ABA)-based intrusion detection framework to detect GPS spoofing attacks on AVs. The framework uses a novel physics-based vehicle behavior model where a GPS navigation model is integrated into the conventional dynamic bicycle model for accurate AV behavior representation. Temporal features derived from this behavior model are analyzed using machine learning to detect normal and abnormal navigation behavior. The performance of the GPS-IDS framework is evaluated on the AV-GPS-Dataset - a real-world dataset collected by the team using an AV testbed. The dataset has been publicly released for the global research community. To the best of our knowledge, this dataset is the first of its kind and will serve as a useful resource to address such security challenges. △ Less

Submitted 14 May, 2024; originally announced May 2024.

Comments: Article under review at IEEE Transactions on Dependable and Secure Computing. For associated AV-GPS-Dataset, see https://github.com/mehrab-abrar/AV-GPS-Dataset

arXiv:2405.06667 [pdf, other]

Sentiment Polarity Analysis of Bangla Food Reviews Using Machine and Deep Learning Algorithms

Authors: Al Amin, Anik Sarkar, Md Mahamodul Islam, Asif Ahammad Miazee, Md Robiul Islam, Md Mahmudul Hoque

Abstract: The Internet has become an essential tool for people in the modern world. Humans, like all living organisms, have essential requirements for survival. These include access to atmospheric oxygen, potable water, protective shelter, and sustenance. The constant flux of the world is making our existence less complicated. A significant portion of the population utilizes online food ordering services to… ▽ More The Internet has become an essential tool for people in the modern world. Humans, like all living organisms, have essential requirements for survival. These include access to atmospheric oxygen, potable water, protective shelter, and sustenance. The constant flux of the world is making our existence less complicated. A significant portion of the population utilizes online food ordering services to have meals delivered to their residences. Although there are numerous methods for ordering food, customers sometimes experience disappointment with the food they receive. Our endeavor was to establish a model that could determine if food is of good or poor quality. We compiled an extensive dataset of over 1484 online reviews from prominent food ordering platforms, including Food Panda and HungryNaki. Leveraging the collected data, a rigorous assessment of various deep learning and machine learning techniques was performed to determine the most accurate approach for predicting food quality. Out of all the algorithms evaluated, logistic regression emerged as the most accurate, achieving an impressive 90.91% accuracy. The review offers valuable insights that will guide the user in deciding whether or not to order the food. △ Less

Submitted 3 May, 2024; originally announced May 2024.

arXiv:2405.06263 [pdf, other]

Learning Latent Dynamic Robust Representations for World Models

Authors: Ruixiang Sun, Hongyu Zang, Xin Li, Riashat Islam

Abstract: Visual Model-Based Reinforcement Learning (MBRL) promises to encapsulate agent's knowledge about the underlying dynamics of the environment, enabling learning a world model as a useful planner. However, top MBRL agents such as Dreamer often struggle with visual pixel-based inputs in the presence of exogenous or irrelevant noise in the observation space, due to failure to capture task-specific feat… ▽ More Visual Model-Based Reinforcement Learning (MBRL) promises to encapsulate agent's knowledge about the underlying dynamics of the environment, enabling learning a world model as a useful planner. However, top MBRL agents such as Dreamer often struggle with visual pixel-based inputs in the presence of exogenous or irrelevant noise in the observation space, due to failure to capture task-specific features while filtering out irrelevant spatio-temporal details. To tackle this problem, we apply a spatio-temporal masking strategy, a bisimulation principle, combined with latent reconstruction, to capture endogenous task-specific aspects of the environment for world models, effectively eliminating non-essential information. Joint training of representations, dynamics, and policy often leads to instabilities. To further address this issue, we develop a Hybrid Recurrent State-Space Model (HRSSM) structure, enhancing state representation robustness for effective policy learning. Our empirical evaluation demonstrates significant performance improvements over existing methods in a range of visually complex control tasks such as Maniskill \cite{gu2023maniskill2} with exogenous distractors from the Matterport environment. Our code is avaliable at https://github.com/bit1029public/HRSSM. △ Less

Submitted 30 May, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

Journal ref: ICML 2024

arXiv:2405.01873 [pdf, other]

Enhancing Bangla Language Next Word Prediction and Sentence Completion through Extended RNN with Bi-LSTM Model On N-gram Language

Authors: Md Robiul Islam, Al Amin, Aniqua Nusrat Zereen

Abstract: Texting stands out as the most prominent form of communication worldwide. Individual spend significant amount of time writing whole texts to send emails or write something on social media, which is time consuming in this modern era. Word prediction and sentence completion will be suitable and appropriate in the Bangla language to make textual information easier and more convenient. This paper expa… ▽ More Texting stands out as the most prominent form of communication worldwide. Individual spend significant amount of time writing whole texts to send emails or write something on social media, which is time consuming in this modern era. Word prediction and sentence completion will be suitable and appropriate in the Bangla language to make textual information easier and more convenient. This paper expands the scope of Bangla language processing by introducing a Bi-LSTM model that effectively handles Bangla next-word prediction and Bangla sentence generation, demonstrating its versatility and potential impact. We proposed a new Bi-LSTM model to predict a following word and complete a sentence. We constructed a corpus dataset from various news portals, including bdnews24, BBC News Bangla, and Prothom Alo. The proposed approach achieved superior results in word prediction, reaching 99\% accuracy for both 4-gram and 5-gram word predictions. Moreover, it demonstrated significant improvement over existing methods, achieving 35\%, 75\%, and 95\% accuracy for uni-gram, bi-gram, and tri-gram word prediction, respectively △ Less

Submitted 3 May, 2024; originally announced May 2024.

Comments: This paper contains 6 pages, 8 figures

arXiv:2404.17960 [pdf, other]

PhishGuard: A Convolutional Neural Network Based Model for Detecting Phishing URLs with Explainability Analysis

Authors: Md Robiul Islam, Md Mahamodul Islam, Mst. Suraiya Afrin, Anika Antara, Nujhat Tabassum, Al Amin

Abstract: Cybersecurity is one of the global issues because of the extensive dependence on cyber systems of individuals, industries, and organizations. Among the cyber attacks, phishing is increasing tremendously and affecting the global economy. Therefore, this phenomenon highlights the vital need for enhancing user awareness and robust support at both individual and organizational levels. Phishing URL ide… ▽ More Cybersecurity is one of the global issues because of the extensive dependence on cyber systems of individuals, industries, and organizations. Among the cyber attacks, phishing is increasing tremendously and affecting the global economy. Therefore, this phenomenon highlights the vital need for enhancing user awareness and robust support at both individual and organizational levels. Phishing URL identification is the best way to address the problem. Various machine learning and deep learning methods have been proposed to automate the detection of phishing URLs. However, these approaches often need more convincing accuracy and rely on datasets consisting of limited samples. Furthermore, these black box intelligent models decision to detect suspicious URLs needs proper explanation to understand the features affecting the output. To address the issues, we propose a 1D Convolutional Neural Network (CNN) and trained the model with extensive features and a substantial amount of data. The proposed model outperforms existing works by attaining an accuracy of 99.85%. Additionally, our explainability analysis highlights certain features that significantly contribute to identifying the phishing URL. △ Less

Submitted 27 April, 2024; originally announced April 2024.

Comments: 6 pages

arXiv:2404.14590 [pdf, other]

PupilSense: Detection of Depressive Episodes Through Pupillary Response in the Wild

Authors: Rahul Islam, Sang Won Bae

Abstract: Early detection of depressive episodes is crucial in managing mental health disorders such as Major Depressive Disorder (MDD) and Bipolar Disorder. However, existing methods often necessitate active participation or are confined to clinical settings. Addressing this gap, we introduce PupilSense, a novel, deep learning-driven mobile system designed to discreetly track pupillary responses as users i… ▽ More Early detection of depressive episodes is crucial in managing mental health disorders such as Major Depressive Disorder (MDD) and Bipolar Disorder. However, existing methods often necessitate active participation or are confined to clinical settings. Addressing this gap, we introduce PupilSense, a novel, deep learning-driven mobile system designed to discreetly track pupillary responses as users interact with their smartphones in their daily lives. This study presents a proof-of-concept exploration of PupilSense's capabilities, where we captured real-time pupillary data from users in naturalistic settings. Our findings indicate that PupilSense can effectively and passively monitor indicators of depressive episodes, offering a promising tool for continuous mental health assessment outside laboratory environments. This advancement heralds a significant step in leveraging ubiquitous mobile technology for proactive mental health care, potentially transforming how depressive episodes are detected and managed in everyday contexts. △ Less

Submitted 22 April, 2024; originally announced April 2024.

Comments: 2024 International Conference on Activity and Behavior Computing

arXiv:2404.14552 [pdf, other]

Generalizing Multi-Step Inverse Models for Representation Learning to Finite-Memory POMDPs

Authors: Lili Wu, Ben Evans, Riashat Islam, Raihan Seraj, Yonathan Efroni, Alex Lamb

Abstract: Discovering an informative, or agent-centric, state representation that encodes only the relevant information while discarding the irrelevant is a key challenge towards scaling reinforcement learning algorithms and efficiently applying them to downstream tasks. Prior works studied this problem in high-dimensional Markovian environments, when the current observation may be a complex object but is s… ▽ More Discovering an informative, or agent-centric, state representation that encodes only the relevant information while discarding the irrelevant is a key challenge towards scaling reinforcement learning algorithms and efficiently applying them to downstream tasks. Prior works studied this problem in high-dimensional Markovian environments, when the current observation may be a complex object but is sufficient to decode the informative state. In this work, we consider the problem of discovering the agent-centric state in the more challenging high-dimensional non-Markovian setting, when the state can be decoded from a sequence of past observations. We establish that generalized inverse models can be adapted for learning agent-centric state representation for this task. Our results include asymptotic theory in the deterministic dynamics setting as well as counter-examples for alternative intuitive algorithms. We complement these findings with a thorough empirical study on the agent-centric state discovery abilities of the different alternatives we put forward. Particularly notable is our analysis of past actions, where we show that these can be a double-edged sword: making the algorithms more successful when used correctly and causing dramatic failure when used incorrectly. △ Less

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.12473 [pdf, other]

Explainable Deep Learning Models for Dynamic and Online Malware Classification

Authors: Quincy Card, Daniel Simpson, Kshitiz Aryal, Maanak Gupta, Sheikh Rabiul Islam

Abstract: In recent years, there has been a significant surge in malware attacks, necessitating more advanced preventive measures and remedial strategies. While several successful AI-based malware classification approaches exist categorized into static, dynamic, or online analysis, most successful AI models lack easily interpretable decisions and explanations for their processes. Our paper aims to delve int… ▽ More In recent years, there has been a significant surge in malware attacks, necessitating more advanced preventive measures and remedial strategies. While several successful AI-based malware classification approaches exist categorized into static, dynamic, or online analysis, most successful AI models lack easily interpretable decisions and explanations for their processes. Our paper aims to delve into explainable malware classification across various execution environments (such as dynamic and online), thoroughly analyzing their respective strengths, weaknesses, and commonalities. To evaluate our approach, we train Feed Forward Neural Networks (FFNN) and Convolutional Neural Networks (CNN) to classify malware based on features obtained from dynamic and online analysis environments. The feature attribution for malware classification is performed by explainability tools, SHAP, LIME and Permutation Importance. We perform a detailed evaluation of the calculated global and local explanations from the experiments, discuss limitations and, ultimately, offer recommendations for achieving a balanced approach. △ Less

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.03528 [pdf, other]

BanglaAutoKG: Automatic Bangla Knowledge Graph Construction with Semantic Neural Graph Filtering

Authors: Azmine Toushik Wasi, Taki Hasan Rafi, Raima Islam, Dong-Kyu Chae

Abstract: Knowledge Graphs (KGs) have proven essential in information processing and reasoning applications because they link related entities and give context-rich information, supporting efficient information retrieval and knowledge discovery; presenting information flow in a very effective manner. Despite being widely used globally, Bangla is relatively underrepresented in KGs due to a lack of comprehens… ▽ More Knowledge Graphs (KGs) have proven essential in information processing and reasoning applications because they link related entities and give context-rich information, supporting efficient information retrieval and knowledge discovery; presenting information flow in a very effective manner. Despite being widely used globally, Bangla is relatively underrepresented in KGs due to a lack of comprehensive datasets, encoders, NER (named entity recognition) models, POS (part-of-speech) taggers, and lemmatizers, hindering efficient information processing and reasoning applications in the language. Addressing the KG scarcity in Bengali, we propose BanglaAutoKG, a pioneering framework that is able to automatically construct Bengali KGs from any Bangla text. We utilize multilingual LLMs to understand various languages and correlate entities and relations universally. By employing a translation dictionary to identify English equivalents and extracting word features from pre-trained BERT models, we construct the foundational KG. To reduce noise and align word embeddings with our goal, we employ graph-based polynomial filters. Lastly, we implement a GNN-based semantic filter, which elevates contextual understanding and trims unnecessary edges, culminating in the formation of the definitive KG. Empirical findings and case studies demonstrate the universal effectiveness of our model, capable of autonomously constructing semantically enriched KGs from any text. △ Less

Submitted 5 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

Comments: 7 pages, 3 figures. Accepted to LREC-COLING 2024. Read in ACL Anthology: https://aclanthology.org/2024.lrec-main.189/

Journal ref: The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

arXiv:2404.00297 [pdf, other]

TRABSA: Interpretable Sentiment Analysis of Tweets using Attention-based BiLSTM and Twitter-RoBERTa

Authors: Md Abrar Jahin, Md Sakib Hossain Shovon, M. F. Mridha, Md Rashedul Islam, Yutaka Watanobe

Abstract: Sentiment analysis is crucial for understanding public opinion and consumer behavior. Existing models face challenges with linguistic diversity, generalizability, and explainability. We propose TRABSA, a hybrid framework integrating transformer-based architectures, attention mechanisms, and BiLSTM networks to address this. Leveraging RoBERTa-trained on 124M tweets, we bridge gaps in sentiment anal… ▽ More Sentiment analysis is crucial for understanding public opinion and consumer behavior. Existing models face challenges with linguistic diversity, generalizability, and explainability. We propose TRABSA, a hybrid framework integrating transformer-based architectures, attention mechanisms, and BiLSTM networks to address this. Leveraging RoBERTa-trained on 124M tweets, we bridge gaps in sentiment analysis benchmarks, ensuring state-of-the-art accuracy. Augmenting datasets with tweets from 32 countries and US states, we compare six word-embedding techniques and three lexicon-based labeling techniques, selecting the best for optimal sentiment analysis. TRABSA outperforms traditional ML and deep learning models with 94% accuracy and significant precision, recall, and F1-score gains. Evaluation across diverse datasets demonstrates consistent superiority and generalizability. SHAP and LIME analyses enhance interpretability, improving confidence in predictions. Our study facilitates pandemic resource management, aiding resource planning, policy formation, and vaccination tactics. △ Less

Submitted 16 May, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

arXiv:2404.00027 [pdf, other]

LLMs as Writing Assistants: Exploring Perspectives on Sense of Ownership and Reasoning

Authors: Azmine Toushik Wasi, Mst Rafia Islam, Raima Islam

Abstract: Sense of ownership in writing confines our investment of thoughts, time, and contribution, leading to attachment to the output. However, using writing assistants introduces a mental dilemma, as some content isn't directly our creation. For instance, we tend to credit Large Language Models (LLMs) more in creative tasks, even though all tasks are equal for them. Additionally, while we may not claim… ▽ More Sense of ownership in writing confines our investment of thoughts, time, and contribution, leading to attachment to the output. However, using writing assistants introduces a mental dilemma, as some content isn't directly our creation. For instance, we tend to credit Large Language Models (LLMs) more in creative tasks, even though all tasks are equal for them. Additionally, while we may not claim complete ownership of LLM-generated content, we freely claim authorship. We conduct a short survey to examine these issues and understand underlying cognitive processes in order to gain a better knowledge of human-computer interaction in writing and improve writing aid systems. △ Less

Submitted 27 July, 2024; v1 submitted 20 March, 2024; originally announced April 2024.

Comments: 8 Pages, 3 Figures. Accepted in The Third Workshop on Intelligent and Interactive Writing Assistants at CHI 2024

arXiv:2404.00026 [pdf, other]

Ink and Individuality: Crafting a Personalised Narrative in the Age of LLMs

Authors: Azmine Toushik Wasi, Raima Islam, Mst Rafia Islam

Abstract: Individuality and personalization comprise the distinctive characteristics that make each writer unique and influence their words in order to effectively engage readers while conveying authenticity. However, our growing reliance on LLM-based writing assistants risks compromising our creativity and individuality over time. We often overlook the negative impacts of this trend on our creativity and u… ▽ More Individuality and personalization comprise the distinctive characteristics that make each writer unique and influence their words in order to effectively engage readers while conveying authenticity. However, our growing reliance on LLM-based writing assistants risks compromising our creativity and individuality over time. We often overlook the negative impacts of this trend on our creativity and uniqueness, despite the possible consequences. This study investigates these concerns by performing a brief survey to explore different perspectives and concepts, as well as trying to understand people's viewpoints, in conjunction with past studies in the area. Addressing these issues is essential for improving human-computer interaction systems and enhancing writing assistants for personalization and individuality. △ Less

Submitted 27 July, 2024; v1 submitted 20 March, 2024; originally announced April 2024.

Comments: 8 Pages, 4 Figures. Accepted in The Third Workshop on Intelligent and Interactive Writing Assistants at CHI 2024

arXiv:2403.17210 [pdf, other]

CADGL: Context-Aware Deep Graph Learning for Predicting Drug-Drug Interactions

Authors: Azmine Toushik Wasi, Taki Hasan Rafi, Raima Islam, Serbetar Karlo, Dong-Kyu Chae

Abstract: Examining Drug-Drug Interactions (DDIs) is a pivotal element in the process of drug development. DDIs occur when one drug's properties are affected by the inclusion of other drugs. Detecting favorable DDIs has the potential to pave the way for creating and advancing innovative medications applicable in practical settings. However, existing DDI prediction models continue to face challenges related… ▽ More Examining Drug-Drug Interactions (DDIs) is a pivotal element in the process of drug development. DDIs occur when one drug's properties are affected by the inclusion of other drugs. Detecting favorable DDIs has the potential to pave the way for creating and advancing innovative medications applicable in practical settings. However, existing DDI prediction models continue to face challenges related to generalization in extreme cases, robust feature extraction, and real-life application possibilities. We aim to address these challenges by leveraging the effectiveness of context-aware deep graph learning by introducing a novel framework named CADGL. Based on a customized variational graph autoencoder (VGAE), we capture critical structural and physio-chemical information using two context preprocessors for feature extraction from two different perspectives: local neighborhood and molecular context, in a heterogeneous graphical structure. Our customized VGAE consists of a graph encoder, a latent information encoder, and an MLP decoder. CADGL surpasses other state-of-the-art DDI prediction models, excelling in predicting clinically valuable novel DDIs, supported by rigorous case studies. △ Less

Submitted 27 March, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: 8 Pages, 4 Figures; In review

arXiv:2403.12984 [pdf, other]

When SMILES have Language: Drug Classification using Text Classification Methods on Drug SMILES Strings

Authors: Azmine Toushik Wasi, Šerbetar Karlo, Raima Islam, Taki Hasan Rafi, Dong-Kyu Chae

Abstract: Complex chemical structures, like drugs, are usually defined by SMILES strings as a sequence of molecules and bonds. These SMILES strings are used in different complex machine learning-based drug-related research and representation works. Escaping from complex representation, in this work, we pose a single question: What if we treat drug SMILES as conventional sentences and engage in text classifi… ▽ More Complex chemical structures, like drugs, are usually defined by SMILES strings as a sequence of molecules and bonds. These SMILES strings are used in different complex machine learning-based drug-related research and representation works. Escaping from complex representation, in this work, we pose a single question: What if we treat drug SMILES as conventional sentences and engage in text classification for drug classification? Our experiments affirm the possibility with very competitive scores. The study explores the notion of viewing each atom and bond as sentence components, employing basic NLP methods to categorize drug types, proving that complex problems can also be solved with simpler perspectives. The data and code are available here: https://github.com/azminewasi/Drug-Classification-NLP. △ Less

Submitted 27 March, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

Comments: 7 pages, 2 figures, 5 tables, Accepted (invited to present) to the The Second Tiny Papers Track at ICLR 2024 (https://openreview.net/forum?id=VUYCyH8fCw)

Journal ref: The Second Tiny Papers Track at {ICLR} 2024, Tiny Papers @ {ICLR} 2024, Vienna Austria, May 11, 2024

arXiv:2403.02665 [pdf, other]

DGAP: Efficient Dynamic Graph Analysis on Persistent Memory

Authors: Abdullah Al Raqibul Islam, Dong Dai

Abstract: Dynamic graphs, featuring continuously updated vertices and edges, have grown in importance for numerous real-world applications. To accommodate this, graph frameworks, particularly their internal data structures, must support both persistent graph updates and rapid graph analysis simultaneously, leading to complex designs to orchestrate `fast but volatile' and `persistent but slow' storage device… ▽ More Dynamic graphs, featuring continuously updated vertices and edges, have grown in importance for numerous real-world applications. To accommodate this, graph frameworks, particularly their internal data structures, must support both persistent graph updates and rapid graph analysis simultaneously, leading to complex designs to orchestrate `fast but volatile' and `persistent but slow' storage devices. Emerging persistent memory technologies, such as Optane DCPMM, offer a promising alternative to simplify the designs by providing data persistence, low latency, and high IOPS together. In light of this, we propose DGAP, a framework for efficient dynamic graph analysis on persistent memory. Unlike traditional dynamic graph frameworks, which combine multiple graph data structures (e.g., edge list or adjacency list) to achieve the required performance, DGAP utilizes a single mutable Compressed Sparse Row (CSR) graph structure with new designs for persistent memory to construct the framework. Specifically, DGAP introduces a \textit{per-section edge log} to reduce write amplification on persistent memory; a \textit{per-thread undo log} to enable high-performance, crash-consistent rebalancing operations; and a data placement schema to minimize in-place updates on persistent memory. Our extensive evaluation results demonstrate that DGAP can achieve up to $3.2\times$ better graph update performance and up to $3.77\times$ better graph analysis performance compared to state-of-the-art dynamic graph frameworks for persistent memory, such as XPGraph, LLAMA, and GraphOne. △ Less

Submitted 5 March, 2024; originally announced March 2024.

arXiv:2402.13495 [pdf, other]

Can One Embedding Fit All? A Multi-Interest Learning Paradigm Towards Improving User Interest Diversity Fairness

Authors: Yuying Zhao, Minghua Xu, Huiyuan Chen, Yuzhong Chen, Yiwei Cai, Rashidul Islam, Yu Wang, Tyler Derr

Abstract: Recommender systems (RSs) have gained widespread applications across various domains owing to the superior ability to capture users' interests. However, the complexity and nuanced nature of users' interests, which span a wide range of diversity, pose a significant challenge in delivering fair recommendations. In practice, user preferences vary significantly; some users show a clear preference towa… ▽ More Recommender systems (RSs) have gained widespread applications across various domains owing to the superior ability to capture users' interests. However, the complexity and nuanced nature of users' interests, which span a wide range of diversity, pose a significant challenge in delivering fair recommendations. In practice, user preferences vary significantly; some users show a clear preference toward certain item categories, while others have a broad interest in diverse ones. Even though it is expected that all users should receive high-quality recommendations, the effectiveness of RSs in catering to this disparate interest diversity remains under-explored. In this work, we investigate whether users with varied levels of interest diversity are treated fairly. Our empirical experiments reveal an inherent disparity: users with broader interests often receive lower-quality recommendations. To mitigate this, we propose a multi-interest framework that uses multiple (virtual) interest embeddings rather than single ones to represent users. Specifically, the framework consists of stacked multi-interest representation layers, which include an interest embedding generator that derives virtual interests from shared parameters, and a center embedding aggregator that facilitates multi-hop aggregation. Experiments demonstrate the effectiveness of the framework in achieving better trade-off between fairness and utility across various datasets and backbones. △ Less

Submitted 20 February, 2024; originally announced February 2024.

Comments: Accepted by WWW'24

arXiv:2402.07832 [pdf]

Best Practices for Facing the Security Challenges of Internet of Things Devices Focusing on Software Development Life Cycle

Authors: Md Rafid Islam, Ratun Rahman

Abstract: In the past few years, the number of IoT devices has grown substantially, and this trend is likely to continue. An increasing amount of effort is being put into developing software for the ever-increasing IoT devices. Every IoT system at its core has software that enables the devices to function efficiently. But security has always been a concern in this age of information and technology. Security… ▽ More In the past few years, the number of IoT devices has grown substantially, and this trend is likely to continue. An increasing amount of effort is being put into developing software for the ever-increasing IoT devices. Every IoT system at its core has software that enables the devices to function efficiently. But security has always been a concern in this age of information and technology. Security for IoT devices is now a top priority due to the growing number of threats. This study introduces best practices for ensuring security in the IoT, with an emphasis on guidelines to be utilized in software development for IoT devices. The objective of the study is to raise awareness of potential threats, emphasizing the secure software development lifecycle. The study will also serve as a point of reference for future developments and provide a solid foundation for securing IoT software and dealing with vulnerabilities. △ Less

Submitted 12 February, 2024; originally announced February 2024.

arXiv:2402.07444 [pdf, other]

doi 10.1145/3589334.3645543

Malicious Package Detection using Metadata Information

Authors: S. Halder, M. Bewong, A. Mahboubi, Y. Jiang, R. Islam, Z. Islam, R. Ip, E. Ahmed, G. Ramachandran, A. Babar

Abstract: Protecting software supply chains from malicious packages is paramount in the evolving landscape of software development. Attacks on the software supply chain involve attackers injecting harmful software into commonly used packages or libraries in a software repository. For instance, JavaScript uses Node Package Manager (NPM), and Python uses Python Package Index (PyPi) as their respective package… ▽ More Protecting software supply chains from malicious packages is paramount in the evolving landscape of software development. Attacks on the software supply chain involve attackers injecting harmful software into commonly used packages or libraries in a software repository. For instance, JavaScript uses Node Package Manager (NPM), and Python uses Python Package Index (PyPi) as their respective package repositories. In the past, NPM has had vulnerabilities such as the event-stream incident, where a malicious package was introduced into a popular NPM package, potentially impacting a wide range of projects. As the integration of third-party packages becomes increasingly ubiquitous in modern software development, accelerating the creation and deployment of applications, the need for a robust detection mechanism has become critical. On the other hand, due to the sheer volume of new packages being released daily, the task of identifying malicious packages presents a significant challenge. To address this issue, in this paper, we introduce a metadata-based malicious package detection model, MeMPtec. This model extracts a set of features from package metadata information. These extracted features are classified as either easy-to-manipulate (ETM) or difficult-to-manipulate (DTM) features based on monotonicity and restricted control properties. By utilising these metadata features, not only do we improve the effectiveness of detecting malicious packages, but also we demonstrate its resistance to adversarial attacks in comparison with existing state-of-the-art. Our experiments indicate a significant reduction in both false positives (up to 97.56%) and false negatives (up to 91.86%). △ Less

Submitted 12 February, 2024; originally announced February 2024.

arXiv:2402.02061 [pdf, ps, other]

Islamic Lifestyle Applications: Meeting the Spiritual Needs of Modern Muslims

Authors: Mohsinul Kabir, Mohammad Ridwan Kabir, Riasat Siam Islam

Abstract: We evaluated contemporary Islamic lifestyle applications supporting religious practices and motivation among Muslims. We reviewed 11 popular applications using self-determination theory and the technology-as-experience framework to assess their support for motivation and affective needs. Most applications lack features that foster autonomy, competence, and relatedness. We also interviewed ten devo… ▽ More We evaluated contemporary Islamic lifestyle applications supporting religious practices and motivation among Muslims. We reviewed 11 popular applications using self-determination theory and the technology-as-experience framework to assess their support for motivation and affective needs. Most applications lack features that foster autonomy, competence, and relatedness. We also interviewed ten devoted Muslim application users to gain insights into their experiences and unmet needs. Our findings indicate that existing applications fall short in providing comprehensive learning, social connections, and scholar consultations. We propose design implications based on our results, including guided religious information, shareability, virtual community engagement, scholarly question-answering, and personalized reminders. We aim to inform the design of Islamic lifestyle applications that better facilitate ritual practices, benefitting application designers and Muslim communities. Our research provides valuable insights into the untapped potential for lifestyle applications to act as religious companions supporting Muslims' spiritual journey. △ Less

Submitted 3 February, 2024; originally announced February 2024.

Comments: 23 pages

arXiv:2402.00201 [pdf, other]

An Experiment on Feature Selection using Logistic Regression

Authors: Raisa Islam, Subhasish Mazumdar, Rakibul Islam

Abstract: In supervised machine learning, feature selection plays a very important role by potentially enhancing explainability and performance as measured by computing time and accuracy-related metrics. In this paper, we investigate a method for feature selection based on the well-known L1 and L2 regularization strategies associated with logistic regression (LR). It is well known that the learned coefficie… ▽ More In supervised machine learning, feature selection plays a very important role by potentially enhancing explainability and performance as measured by computing time and accuracy-related metrics. In this paper, we investigate a method for feature selection based on the well-known L1 and L2 regularization strategies associated with logistic regression (LR). It is well known that the learned coefficients, which serve as weights, can be used to rank the features. Our approach is to synthesize the findings of L1 and L2 regularization. For our experiment, we chose the CIC-IDS2018 dataset owing partly to its size and also to the existence of two problematic classes that are hard to separate. We report first with the exclusion of one of them and then with its inclusion. We ranked features first with L1 and then with L2, and then compared logistic regression with L1 (LR+L1) against that with L2 (LR+L2) by varying the sizes of the feature sets for each of the two rankings. We found no significant difference in accuracy between the two methods once the feature set is selected. We chose a synthesis, i.e., only those features that were present in both the sets obtained from L1 and that from L2, and experimented with it on more complex models like Decision Tree and Random Forest and observed that the accuracy was very close in spite of the small size of the feature set. Additionally, we also report on the standard metrics: accuracy, precision, recall, and f1-score. △ Less

Submitted 31 January, 2024; originally announced February 2024.

arXiv:2401.07725 [pdf]

doi 10.5121/ijwmn.2023.15603

Effects Investigation of MAC and PHY Layer Parameters on the Performance of IEEE 802.15.6 CSMA/CA

Authors: Md. Abubakar Siddik, Most. Anju Ara Hasi, Md. Rajiul Islam, Jakia Akter Nitu

Abstract: The recently released IEEE 802.15.6 standard specifies several physical (PHY) layer and medium access control (MAC) layer protocols for variety of medical and non-medical applications of Wireless Body Area Networks (WBAN). The most suitable way for enhancing network performance is to be the choice of different MAC and PHY parameters based on quality of service (QoS) requirements of different appli… ▽ More The recently released IEEE 802.15.6 standard specifies several physical (PHY) layer and medium access control (MAC) layer protocols for variety of medical and non-medical applications of Wireless Body Area Networks (WBAN). The most suitable way for enhancing network performance is to be the choice of different MAC and PHY parameters based on quality of service (QoS) requirements of different applications. The impact of different MAC and PHY parameters on the network performance and the trade-off relationship between the parameters are essential to overcome the limitations of exiting carrier sense multiple access with collision avoidance (CSMA/CA) scheme of IEEE 802.15.6 standard. To address this issue, we develop a Markov chain-based analytical model of IEEE 802.15.6 CSMA/CA for all user priorities (UPs) and apply this general model to different network scenarios to investigate the effects of the packet arrival rate, channel condition, payload size, access phase length, access mechanism and number of nodes on the performance parameters viz. reliability, normalized throughput, energy consumption and average access delay. Moreover, we conclude the effectiveness of different access phases, access mechanisms and user priorities of intra-WBAN. △ Less

Submitted 15 January, 2024; originally announced January 2024.

Comments: 22 pages. arXiv admin note: substantial text overlap with arXiv:2209.00247

arXiv:2401.02523 [pdf, other]

Image-based Deep Learning for Smart Digital Twins: a Review

Authors: Md Ruman Islam, Mahadevan Subramaniam, Pei-Chi Huang

Abstract: Smart Digital twins (SDTs) are being increasingly used to virtually replicate and predict the behaviors of complex physical systems through continual data assimilation enabling the optimization of the performance of these systems by controlling the actions of systems. Recently, deep learning (DL) models have significantly enhanced the capabilities of SDTs, particularly for tasks such as predictive… ▽ More Smart Digital twins (SDTs) are being increasingly used to virtually replicate and predict the behaviors of complex physical systems through continual data assimilation enabling the optimization of the performance of these systems by controlling the actions of systems. Recently, deep learning (DL) models have significantly enhanced the capabilities of SDTs, particularly for tasks such as predictive maintenance, anomaly detection, and optimization. In many domains, including medicine, engineering, and education, SDTs use image data (image-based SDTs) to observe and learn system behaviors and control their behaviors. This paper focuses on various approaches and associated challenges in developing image-based SDTs by continually assimilating image data from physical systems. The paper also discusses the challenges involved in designing and implementing DL models for SDTs, including data acquisition, processing, and interpretation. In addition, insights into the future directions and opportunities for developing new image-based DL approaches to develop robust SDTs are provided. This includes the potential for using generative models for data augmentation, developing multi-modal DL models, and exploring the integration of DL with other technologies, including 5G, edge computing, and IoT. In this paper, we describe the image-based SDTs, which enable broader adoption of the digital twin DT paradigms across a broad spectrum of areas and the development of new methods to improve the abilities of SDTs in replicating, predicting, and optimizing the behavior of complex systems. △ Less

Submitted 4 January, 2024; originally announced January 2024.

Comments: 12 pages, 2 figures, and 3 tables

arXiv:2312.15738 [pdf]

Enhanced Robot Motion Block of A-star Algorithm for Robotic Path Planning

Authors: Raihan Kabir, Yutaka Watanobe, Md. Rashedul Islam, Keitaro Naruse

Abstract: An efficient robot path-planning model is vulnerable to the number of search nodes, path cost, and time complexity. The conventional A-star (A*) algorithm outperforms other grid-based algorithms for its heuristic search. However it shows suboptimal performance for the time, space, and number of search nodes, depending on the robot motion block (RMB). To address this challenge, this study proposes… ▽ More An efficient robot path-planning model is vulnerable to the number of search nodes, path cost, and time complexity. The conventional A-star (A*) algorithm outperforms other grid-based algorithms for its heuristic search. However it shows suboptimal performance for the time, space, and number of search nodes, depending on the robot motion block (RMB). To address this challenge, this study proposes an optimal RMB for the A* path-planning algorithm to enhance the performance, where the robot movement costs are calculated by the proposed adaptive cost function. Also, a selection process is proposed to select the optimal RMB size. In this proposed model, grid-based maps are used, where the robot's next move is determined based on the adaptive cost function by searching among surrounding octet neighborhood grid cells. The cumulative value from the output data arrays is used to determine the optimal motion block size, which is formulated based on parameters. The proposed RMB significantly affects the searching time complexity and number of search nodes of the A* algorithm while maintaining almost the same path cost to find the goal position by avoiding obstacles. For the experiment, a benchmarked online dataset is used and prepared three different dimensional maps. The proposed approach is validated using approximately 7000 different grid maps with various dimensions and obstacle environments. The proposed model with an optimal RMB demonstrated a remarkable improvement of 93.98% in the number of search cells and 98.94% in time complexity compared to the conventional A* algorithm. Path cost for the proposed model remained largely comparable to other state-of-the-art algorithms. Also, the proposed model outperforms other state-of-the-art algorithms. △ Less

Submitted 25 December, 2023; originally announced December 2023.

Comments: 15 pages, 14 figures

arXiv:2312.01301 [pdf, other]

Churn Prediction via Multimodal Fusion Learning:Integrating Customer Financial Literacy, Voice, and Behavioral Data

Authors: David Hason Rudd, Huan Huo, Md Rafiqul Islam, Guandong Xu

Abstract: In todays competitive landscape, businesses grapple with customer retention. Churn prediction models, although beneficial, often lack accuracy due to the reliance on a single data source. The intricate nature of human behavior and high dimensional customer data further complicate these efforts. To address these concerns, this paper proposes a multimodal fusion learning model for identifying custom… ▽ More In todays competitive landscape, businesses grapple with customer retention. Churn prediction models, although beneficial, often lack accuracy due to the reliance on a single data source. The intricate nature of human behavior and high dimensional customer data further complicate these efforts. To address these concerns, this paper proposes a multimodal fusion learning model for identifying customer churn risk levels in financial service providers. Our multimodal approach integrates customer sentiments financial literacy (FL) level, and financial behavioral data, enabling more accurate and bias-free churn prediction models. The proposed FL model utilizes a SMOGN COREG supervised model to gauge customer FL levels from their financial data. The baseline churn model applies an ensemble artificial neural network and oversampling techniques to predict churn propensity in high-dimensional financial data. We also incorporate a speech emotion recognition model employing a pre-trained CNN-VGG16 to recognize customer emotions based on pitch, energy, and tone. To integrate these diverse features while retaining unique insights, we introduced late and hybrid fusion techniques that complementary boost coordinated multimodal co learning. Robust metrics were utilized to evaluate the proposed multimodal fusion model and hence the approach validity, including mean average precision and macro-averaged F1 score. Our novel approach demonstrates a marked improvement in churn prediction, achieving a test accuracy of 91.2%, a Mean Average Precision (MAP) score of 66, and a Macro-Averaged F1 score of 54 through the proposed hybrid fusion learning technique compared with late fusion and baseline models. Furthermore, the analysis demonstrates a positive correlation between negative emotions, low FL scores, and high-risk customers. △ Less

Submitted 3 December, 2023; originally announced December 2023.

arXiv:2311.16276 [pdf]

Darknet Traffic Analysis A Systematic Literature Review

Authors: Javeriah Saleem, Rafiqul Islam, Zahidul Islam

Abstract: The primary objective of an anonymity tool is to protect the anonymity of its users through the implementation of strong encryption and obfuscation techniques. As a result, it becomes very difficult to monitor and identify users activities on these networks. Moreover, such systems have strong defensive mechanisms to protect users against potential risks, including the extraction of traffic charact… ▽ More The primary objective of an anonymity tool is to protect the anonymity of its users through the implementation of strong encryption and obfuscation techniques. As a result, it becomes very difficult to monitor and identify users activities on these networks. Moreover, such systems have strong defensive mechanisms to protect users against potential risks, including the extraction of traffic characteristics and website fingerprinting. However, the strong anonymity feature also functions as a refuge for those involved in illicit activities who aim to avoid being traced on the network. As a result, a substantial body of research has been undertaken to examine and classify encrypted traffic using machine learning techniques. This paper presents a comprehensive examination of the existing approaches utilized for the categorization of anonymous traffic as well as encrypted network traffic inside the darknet. Also, this paper presents a comprehensive analysis of methods of darknet traffic using machine learning techniques to monitor and identify the traffic attacks inside the darknet. △ Less

Submitted 27 November, 2023; originally announced November 2023.

Comments: 35 Pages, 13 Figures

arXiv:2311.16180 [pdf]

Aiming to Minimize Alcohol-Impaired Road Fatalities: Utilizing Fairness-Aware and Domain Knowledge-Infused Artificial Intelligence

Authors: Tejas Venkateswaran, Sheikh Rabiul Islam, Md Golam Moula Mehedi Hasan, Mohiuddin Ahmed

Abstract: Approximately 30% of all traffic fatalities in the United States are attributed to alcohol-impaired driving. This means that, despite stringent laws against this offense in every state, the frequency of drunk driving accidents is alarming, resulting in approximately one person being killed every 45 minutes. The process of charging individuals with Driving Under the Influence (DUI) is intricate and… ▽ More Approximately 30% of all traffic fatalities in the United States are attributed to alcohol-impaired driving. This means that, despite stringent laws against this offense in every state, the frequency of drunk driving accidents is alarming, resulting in approximately one person being killed every 45 minutes. The process of charging individuals with Driving Under the Influence (DUI) is intricate and can sometimes be subjective, involving multiple stages such as observing the vehicle in motion, interacting with the driver, and conducting Standardized Field Sobriety Tests (SFSTs). Biases have been observed through racial profiling, leading to some groups and geographical areas facing fewer DUI tests, resulting in many actual DUI incidents going undetected, ultimately leading to a higher number of fatalities. To tackle this issue, our research introduces an Artificial Intelligence-based predictor that is both fairness-aware and incorporates domain knowledge to analyze DUI-related fatalities in different geographic locations. Through this model, we gain intriguing insights into the interplay between various demographic groups, including age, race, and income. By utilizing the provided information to allocate policing resources in a more equitable and efficient manner, there is potential to reduce DUI-related fatalities and have a significant impact on road safety. △ Less

Submitted 24 November, 2023; originally announced November 2023.

Comments: IEEE Big Data 2023

arXiv:2311.14883 [pdf]

Predicting Potential School Shooters from Social Media Posts

Authors: Alana Cedeno, Rachel Liang, Sheikh Rabiul Islam

Abstract: The rate of terror attacks has surged over the past decade, resulting in the tragic and senseless loss or alteration of numerous lives. Offenders behind mass shootings, bombings, or other domestic terrorism incidents have historically exhibited warning signs on social media before carrying out actual incidents. However, due to inadequate and comprehensive police procedures, authorities and social… ▽ More The rate of terror attacks has surged over the past decade, resulting in the tragic and senseless loss or alteration of numerous lives. Offenders behind mass shootings, bombings, or other domestic terrorism incidents have historically exhibited warning signs on social media before carrying out actual incidents. However, due to inadequate and comprehensive police procedures, authorities and social media platforms are often unable to detect these early indicators of intent. To tackle this issue, we aim to create a multimodal model capable of predicting sentiments simultaneously from both images (i.e., social media photos) and text (i.e., social media posts), generating a unified prediction. The proposed method involves segregating the image and text components of an online post and utilizing a captioning model to generate sentences summarizing the image's contents. Subsequently, a sentiment analyzer evaluates this caption, or description, along with the original post's text to determine whether the post is positive (i.e., concerning) or negative (i.e., benign). This undertaking represents a significant step toward implementing the developed system in real-world scenarios. △ Less

Submitted 24 November, 2023; originally announced November 2023.

Journal ref: IEEE Big Data 2023

arXiv:2311.03534 [pdf, other]

PcLast: Discovering Plannable Continuous Latent States

Authors: Anurag Koul, Shivakanth Sujit, Shaoru Chen, Ben Evans, Lili Wu, Byron Xu, Rajan Chari, Riashat Islam, Raihan Seraj, Yonathan Efroni, Lekan Molu, Miro Dudik, John Langford, Alex Lamb

Abstract: Goal-conditioned planning benefits from learned low-dimensional representations of rich observations. While compact latent representations typically learned from variational autoencoders or inverse dynamics enable goal-conditioned decision making, they ignore state reachability, hampering their performance. In this paper, we learn a representation that associates reachable states together for effe… ▽ More Goal-conditioned planning benefits from learned low-dimensional representations of rich observations. While compact latent representations typically learned from variational autoencoders or inverse dynamics enable goal-conditioned decision making, they ignore state reachability, hampering their performance. In this paper, we learn a representation that associates reachable states together for effective planning and goal-conditioned policy learning. We first learn a latent representation with multi-step inverse dynamics (to remove distracting information), and then transform this representation to associate reachable states together in $\ell_2$ space. Our proposals are rigorously tested in various simulation testbeds. Numerical results in reward-based settings show significant improvements in sampling efficiency. Further, in reward-free settings this approach yields layered state abstractions that enable computationally efficient hierarchical planning for reaching ad hoc goals with zero additional samples. △ Less

Submitted 10 June, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

Comments: Accepted at ICML 2024

arXiv:2311.00767 [pdf, other]

doi 10.1109/IJCNN55064.2022.9892631

Hand Gesture Classification on Praxis Dataset: Trading Accuracy for Expense

Authors: Rahat Islam, Kenneth Lai, Svetlana Yanushkevich

Abstract: In this paper, we investigate hand gesture classifiers that rely upon the abstracted 'skeletal' data recorded using the RGB-Depth sensor. We focus on 'skeletal' data represented by the body joint coordinates, from the Praxis dataset. The PRAXIS dataset contains recordings of patients with cortical pathologies such as Alzheimer's disease, performing a Praxis test under the direction of a clinician.… ▽ More In this paper, we investigate hand gesture classifiers that rely upon the abstracted 'skeletal' data recorded using the RGB-Depth sensor. We focus on 'skeletal' data represented by the body joint coordinates, from the Praxis dataset. The PRAXIS dataset contains recordings of patients with cortical pathologies such as Alzheimer's disease, performing a Praxis test under the direction of a clinician. In this paper, we propose hand gesture classifiers that are more effective with the PRAXIS dataset than previously proposed models. Body joint data offers a compressed form of data that can be analyzed specifically for hand gesture recognition. Using a combination of windowing techniques with deep learning architecture such as a Recurrent Neural Network (RNN), we achieved an overall accuracy of 70.8% using only body joint data. In addition, we investigated a long-short-term-memory (LSTM) to extract and analyze the movement of the joints through time to recognize the hand gestures being performed and achieved a gesture recognition rate of 74.3% and 67.3% for static and dynamic gestures, respectively. The proposed approach contributed to the task of developing an automated, accurate, and inexpensive approach to diagnosing cortical pathologies for multiple healthcare applications. △ Less

Submitted 1 November, 2023; originally announced November 2023.

Comments: 8 pages, 6 figures

Journal ref: 2022 International Joint Conference on Neural Networks (IJCNN), Padua, pp. 1-8

arXiv:2310.20474 [pdf, other]

Critical Role of Artificially Intelligent Conversational Chatbot

Authors: Seraj A. M. Mostafa, Md Z. Islam, Mohammad Z. Islam, Fairose Jeehan, Saujanna Jafreen, Raihan U. Islam

Abstract: Artificially intelligent chatbot, such as ChatGPT, represents a recent and powerful advancement in the AI domain. Users prefer them for obtaining quick and precise answers, avoiding the usual hassle of clicking through multiple links in traditional searches. ChatGPT's conversational approach makes it comfortable and accessible for finding answers quickly and in an organized manner. However, it is… ▽ More Artificially intelligent chatbot, such as ChatGPT, represents a recent and powerful advancement in the AI domain. Users prefer them for obtaining quick and precise answers, avoiding the usual hassle of clicking through multiple links in traditional searches. ChatGPT's conversational approach makes it comfortable and accessible for finding answers quickly and in an organized manner. However, it is important to note that these chatbots have limitations, especially in terms of providing accurate answers as well as ethical concerns. In this study, we explore various scenarios involving ChatGPT's ethical implications within academic contexts, its limitations, and the potential misuse by specific user groups. To address these challenges, we propose architectural solutions aimed at preventing inappropriate use and promoting responsible AI interactions. △ Less

Submitted 31 October, 2023; originally announced October 2023.

Comments: Extended version of Conversation 2023 position paper

arXiv:2310.17139 [pdf, other]

Understanding and Addressing the Pitfalls of Bisimulation-based Representations in Offline Reinforcement Learning

Authors: Hongyu Zang, Xin Li, Leiji Zhang, Yang Liu, Baigui Sun, Riashat Islam, Remi Tachet des Combes, Romain Laroche

Abstract: While bisimulation-based approaches hold promise for learning robust state representations for Reinforcement Learning (RL) tasks, their efficacy in offline RL tasks has not been up to par. In some instances, their performance has even significantly underperformed alternative methods. We aim to understand why bisimulation methods succeed in online settings, but falter in offline tasks. Our analysis… ▽ More While bisimulation-based approaches hold promise for learning robust state representations for Reinforcement Learning (RL) tasks, their efficacy in offline RL tasks has not been up to par. In some instances, their performance has even significantly underperformed alternative methods. We aim to understand why bisimulation methods succeed in online settings, but falter in offline tasks. Our analysis reveals that missing transitions in the dataset are particularly harmful to the bisimulation principle, leading to ineffective estimation. We also shed light on the critical role of reward scaling in bounding the scale of bisimulation measurements and of the value error they induce. Based on these findings, we propose to apply the expectile operator for representation learning to our offline RL setting, which helps to prevent overfitting to incomplete data. Meanwhile, by introducing an appropriate reward scaling strategy, we avoid the risk of feature collapse in representation space. We implement these recommendations on two state-of-the-art bisimulation-based algorithms, MICo and SimSR, and demonstrate performance gains on two benchmark suites: D4RL and Visual D4RL. Codes are provided at \url{https://github.com/zanghyu/Offline_Bisimulation}. △ Less

Submitted 26 October, 2023; originally announced October 2023.

Comments: NeurIPS 2023

arXiv:2310.03122 [pdf, other]

SPH-based framework for modelling fluid-structure interaction problems with finite deformation and fracturing

Authors: Md Rushdie Ibne Islam

Abstract: Understanding crack propagation in structures subjected to fluid loads is crucial in various engineering applications, ranging from underwater pipelines to aircraft components. This study investigates the dynamic response of structures, including their damage and fracture behaviour under hydrodynamic load, emphasizing the fluid-structure interaction (FSI) phenomena by applying Smoothed Particle Hy… ▽ More Understanding crack propagation in structures subjected to fluid loads is crucial in various engineering applications, ranging from underwater pipelines to aircraft components. This study investigates the dynamic response of structures, including their damage and fracture behaviour under hydrodynamic load, emphasizing the fluid-structure interaction (FSI) phenomena by applying Smoothed Particle Hydrodynamics (SPH). The developed framework employs weakly compressible SPH (WCSPH) to model the fluid flow and a pseudo-spring-based SPH solver for modelling the structural response. For improved accuracy in FSI modelling, the $δ$-SPH technique is implemented to enhance pressure calculations within the fluid phase. The pseudo-spring analogy is employed for modelling material damage, where particle interactions are confined to their immediate neighbours. These particles are linked by springs, which don't contribute to system stiffness but determine the interaction strength between connected pairs. It is assumed that a crack propagates through a spring connecting a particle pair when the damage indicator of that spring exceeds a predefined threshold. The developed framework is extensively validated through a dam break case, oscillation of a deformable solid beam, dam break through a deformable elastic solid, and breaking dam impact on a deformable solid obstacle. Numerical outcomes are subsequently compared with the findings from existing literature. The ability of the framework to accurately depict material damage and fracture is showcased through a simulation of water impact on a deformable solid obstacle with an initial notch. △ Less

Submitted 22 September, 2023; originally announced October 2023.

arXiv:2309.12617

Analyzing the Influence of Processor Speed and Clock Speed on Remaining Useful Life Estimation of Software Systems

Authors: M. Rubyet Islam, Peter Sandborn

Abstract: Prognostics and Health Management (PHM) is a discipline focused on predicting the point at which systems or components will cease to perform as intended, typically measured as Remaining Useful Life (RUL). RUL serves as a vital decision-making tool for contingency planning, guiding the timing and nature of system maintenance. Historically, PHM has primarily been applied to hardware systems, with it… ▽ More Prognostics and Health Management (PHM) is a discipline focused on predicting the point at which systems or components will cease to perform as intended, typically measured as Remaining Useful Life (RUL). RUL serves as a vital decision-making tool for contingency planning, guiding the timing and nature of system maintenance. Historically, PHM has primarily been applied to hardware systems, with its application to software only recently explored. In a recent study we introduced a methodology and demonstrated how changes in software can impact the RUL of software. However, in practical software development, real-time performance is also influenced by various environmental attributes, including operating systems, clock speed, processor performance, RAM, machine core count and others. This research extends the analysis to assess how changes in environmental attributes, such as operating system and clock speed, affect RUL estimation in software. Findings are rigorously validated using real performance data from controlled test beds and compared with predictive model-generated data. Statistical validation, including regression analysis, supports the credibility of the results. The controlled test bed environment replicates and validates faults from real applications, ensuring a standardized assessment platform. This exploration yields actionable knowledge for software maintenance and optimization strategies, addressing a significant gap in the field of software health management. △ Less

Submitted 9 March, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

Comments: I do not wish to put this in arXiv anymore. I need a total re-work on this paper for which I do nor have any time for. I realize that this data /paper should not be be on this archive anymore

arXiv:2308.05179 [pdf]

JutePestDetect: An Intelligent Approach for Jute Pest Identification Using Fine-Tuned Transfer Learning

Authors: Md. Simul Hasan Talukder, Mohammad Raziuddin Chowdhury, Md Sakib Ullah Sourav, Abdullah Al Rakin, Shabbir Ahmed Shuvo, Rejwan Bin Sulaiman, Musarrat Saberin Nipun, Muntarin Islam, Mst Rumpa Islam, Md Aminul Islam, Zubaer Haque

Abstract: In certain Asian countries, Jute is one of the primary sources of income and Gross Domestic Product (GDP) for the agricultural sector. Like many other crops, Jute is prone to pest infestations, and its identification is typically made visually in countries like Bangladesh, India, Myanmar, and China. In addition, this method is time-consuming, challenging, and somewhat imprecise, which poses a subs… ▽ More In certain Asian countries, Jute is one of the primary sources of income and Gross Domestic Product (GDP) for the agricultural sector. Like many other crops, Jute is prone to pest infestations, and its identification is typically made visually in countries like Bangladesh, India, Myanmar, and China. In addition, this method is time-consuming, challenging, and somewhat imprecise, which poses a substantial financial risk. To address this issue, the study proposes a high-performing and resilient transfer learning (TL) based JutePestDetect model to identify jute pests at the early stage. Firstly, we prepared jute pest dataset containing 17 classes and around 380 photos per pest class, which were evaluated after manual and automatic pre-processing and cleaning, such as background removal and resizing. Subsequently, five prominent pre-trained models -DenseNet201, InceptionV3, MobileNetV2, VGG19, and ResNet50 were selected from a previous study to design the JutePestDetect model. Each model was revised by replacing the classification layer with a global average pooling layer and incorporating a dropout layer for regularization. To evaluate the models performance, various metrics such as precision, recall, F1 score, ROC curve, and confusion matrix were employed. These analyses provided additional insights for determining the efficacy of the models. Among them, the customized regularized DenseNet201-based proposed JutePestDetect model outperformed the others, achieving an impressive accuracy of 99%. As a result, our proposed method and strategy offer an enhanced approach to pest identification in the case of Jute, which can significantly benefit farmers worldwide. △ Less

Submitted 28 May, 2023; originally announced August 2023.

Comments: 29 Pages, 7 Tables, 7 Figures, 5 Appendix

arXiv:2308.03200 [pdf, other]

doi 10.1038/s41598-023-51015-1

Uncovering local aggregated air quality index with smartphone captured images leveraging efficient deep convolutional neural network

Authors: Joyanta Jyoti Mondal, Md. Farhadul Islam, Raima Islam, Nowsin Kabir Rhidi, Sarfaraz Newaz, Meem Arafat Manab, A. B. M. Alim Al Islam, Jannatun Noor

Abstract: The prevalence and mobility of smartphones make these a widely used tool for environmental health research. However, their potential for determining aggregated air quality index (AQI) based on PM2.5 concentration in specific locations remains largely unexplored in the existing literature. In this paper, we thoroughly examine the challenges associated with predicting location-specific PM2.5 concent… ▽ More The prevalence and mobility of smartphones make these a widely used tool for environmental health research. However, their potential for determining aggregated air quality index (AQI) based on PM2.5 concentration in specific locations remains largely unexplored in the existing literature. In this paper, we thoroughly examine the challenges associated with predicting location-specific PM2.5 concentration using images taken with smartphone cameras. The focus of our study is on Dhaka, the capital of Bangladesh, due to its significant air pollution levels and the large population exposed to it. Our research involves the development of a Deep Convolutional Neural Network (DCNN), which we train using over a thousand outdoor images taken and annotated. These photos are captured at various locations in Dhaka, and their labels are based on PM2.5 concentration data obtained from the local US consulate, calculated using the NowCast algorithm. Through supervised learning, our model establishes a correlation index during training, enhancing its ability to function as a Picture-based Predictor of PM2.5 Concentration (PPPC). This enables the algorithm to calculate an equivalent daily averaged AQI index from a smartphone image. Unlike, popular overly parameterized models, our model shows resource efficiency since it uses fewer parameters. Furthermore, test results indicate that our model outperforms popular models like ViT and INN, as well as popular CNN-based models such as VGG19, ResNet50, and MobileNetV2, in predicting location-specific PM2.5 concentration. Our dataset is the first publicly available collection that includes atmospheric images and corresponding PM2.5 measurements from Dhaka. Our codes and dataset are available at https://github.com/lepotatoguy/aqi. △ Less

Submitted 18 January, 2024; v1 submitted 6 August, 2023; originally announced August 2023.

Comments: 18 pages, 7 figures, published to Nature Scientific Reports

Journal ref: Sci Rep 14, 1627 (2024)

arXiv:2307.13164 [pdf]

Malware Resistant Data Protection in Hyper-connected Networks: A survey

Authors: Jannatul Ferdous, Rafiqul Islam, Maumita Bhattacharya, Md Zahidul Islam

Abstract: Data protection is the process of securing sensitive information from being corrupted, compromised, or lost. A hyperconnected network, on the other hand, is a computer networking trend in which communication occurs over a network. However, what about malware. Malware is malicious software meant to penetrate private data, threaten a computer system, or gain unauthorised network access without the u… ▽ More Data protection is the process of securing sensitive information from being corrupted, compromised, or lost. A hyperconnected network, on the other hand, is a computer networking trend in which communication occurs over a network. However, what about malware. Malware is malicious software meant to penetrate private data, threaten a computer system, or gain unauthorised network access without the users consent. Due to the increasing applications of computers and dependency on electronically saved private data, malware attacks on sensitive information have become a dangerous issue for individuals and organizations across the world. Hence, malware defense is critical for keeping our computer systems and data protected. Many recent survey articles have focused on either malware detection systems or single attacking strategies variously. To the best of our knowledge, no survey paper demonstrates malware attack patterns and defense strategies combinedly. Through this survey, this paper aims to address this issue by merging diverse malicious attack patterns and machine learning (ML) based detection models for modern and sophisticated malware. In doing so, we focus on the taxonomy of malware attack patterns based on four fundamental dimensions the primary goal of the attack, method of attack, targeted exposure and execution process, and types of malware that perform each attack. Detailed information on malware analysis approaches is also investigated. In addition, existing malware detection techniques employing feature extraction and ML algorithms are discussed extensively. Finally, it discusses research difficulties and unsolved problems, including future research directions. △ Less

Submitted 24 July, 2023; originally announced July 2023.

Comments: 30 pages, 9 figures, 7 tables, no where submitted yet

MSC Class: F.2.2; I.2.7 ACM-class: F.2.2; I.2.7 j.2.1 ACM Class: F.2.2; I.2.7

arXiv:2307.12237 [pdf]

Demonstration of a Response Time Based Remaining Useful Life (RUL) Prediction for Software Systems

Authors: Ray Islam, Peter Sandborn

Abstract: Prognostic and Health Management (PHM) has been widely applied to hardware systems in the electronics and non-electronics domains but has not been explored for software. While software does not decay over time, it can degrade over release cycles. Software health management is confined to diagnostic assessments that identify problems, whereas prognostic assessment potentially indicates when in the… ▽ More Prognostic and Health Management (PHM) has been widely applied to hardware systems in the electronics and non-electronics domains but has not been explored for software. While software does not decay over time, it can degrade over release cycles. Software health management is confined to diagnostic assessments that identify problems, whereas prognostic assessment potentially indicates when in the future a problem will become detrimental. Relevant research areas such as software defect prediction, software reliability prediction, predictive maintenance of software, software degradation, and software performance prediction, exist, but all of these represent diagnostic models built upon historical data, none of which can predict an RUL for software. This paper addresses the application of PHM concepts to software systems for fault predictions and RUL estimation. Specifically, this paper addresses how PHM can be used to make decisions for software systems such as version update and upgrade, module changes, system reengineering, rejuvenation, maintenance scheduling, budgeting, and total abandonment. This paper presents a method to prognostically and continuously predict the RUL of a software system based on usage parameters (e.g., the numbers and categories of releases) and performance parameters (e.g., response time). The model developed has been validated by comparing actual data, with the results that were generated by predictive models. Statistical validation (regression validation, and k-fold cross validation) has also been carried out. A case study, based on publicly available data for the Bugzilla application is presented. This case study demonstrates that PHM concepts can be applied to software systems and RUL can be calculated to make system management decisions. △ Less

Submitted 23 July, 2023; originally announced July 2023.

Comments: This research methodology has opened up new and practical applications in the software domain. In the coming decades, we can expect a significant amount of attention and practical implementation in this area worldwide

Journal ref: Published in the Journal of Prognostics and Health Management in March, 2023

arXiv:2306.00421 [pdf, other]

Introduction to Medical Imaging Informatics

Authors: Md. Zihad Bin Jahangir, Ruksat Hossain, Riadul Islam, MD Abdullah Al Nasim, Md. Mahim Anjum Haque, Md Jahangir Alam, Sajedul Talukder

Abstract: Medical imaging informatics is a rapidly growing field that combines the principles of medical imaging and informatics to improve the acquisition, management, and interpretation of medical images. This chapter introduces the basic concepts of medical imaging informatics, including image processing, feature engineering, and machine learning. It also discusses the recent advancements in computer vis… ▽ More Medical imaging informatics is a rapidly growing field that combines the principles of medical imaging and informatics to improve the acquisition, management, and interpretation of medical images. This chapter introduces the basic concepts of medical imaging informatics, including image processing, feature engineering, and machine learning. It also discusses the recent advancements in computer vision and deep learning technologies and how they are used to develop new quantitative image markers and prediction models for disease detection, diagnosis, and prognosis prediction. By covering the basic knowledge of medical imaging informatics, this chapter provides a foundation for understanding the role of informatics in medicine and its potential impact on patient care. △ Less

Submitted 17 June, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

Comments: 18 pages, 11 figures, 2 tables; Acceptance of the chapter for the Springer book "Data-driven approaches to medical imaging"

arXiv:2305.08014 [pdf]

Surface EMG-Based Inter-Session/Inter-Subject Gesture Recognition by Leveraging Lightweight All-ConvNet and Transfer Learning

Authors: Md. Rabiul Islam, Daniel Massicotte, Philippe Y. Massicotte, Wei-Ping Zhu

Abstract: Gesture recognition using low-resolution instantaneous HD-sEMG images opens up new avenues for the development of more fluid and natural muscle-computer interfaces. However, the data variability between inter-session and inter-subject scenarios presents a great challenge. The existing approaches employed very large and complex deep ConvNet or 2SRNN-based domain adaptation methods to approximate th… ▽ More Gesture recognition using low-resolution instantaneous HD-sEMG images opens up new avenues for the development of more fluid and natural muscle-computer interfaces. However, the data variability between inter-session and inter-subject scenarios presents a great challenge. The existing approaches employed very large and complex deep ConvNet or 2SRNN-based domain adaptation methods to approximate the distribution shift caused by these inter-session and inter-subject data variability. Hence, these methods also require learning over millions of training parameters and a large pre-trained and target domain dataset in both the pre-training and adaptation stages. As a result, it makes high-end resource-bounded and computationally very expensive for deployment in real-time applications. To overcome this problem, we propose a lightweight All-ConvNet+TL model that leverages lightweight All-ConvNet and transfer learning (TL) for the enhancement of inter-session and inter-subject gesture recognition performance. The All-ConvNet+TL model consists solely of convolutional layers, a simple yet efficient framework for learning invariant and discriminative representations to address the distribution shifts caused by inter-session and inter-subject data variability. Experiments on four datasets demonstrate that our proposed methods outperform the most complex existing approaches by a large margin and achieve state-of-the-art results on inter-session and inter-subject scenarios and perform on par or competitively on intra-session gesture recognition. These performance gaps increase even more when a tiny amount (e.g., a single trial) of data is available on the target domain for adaptation. These outstanding experimental results provide evidence that the current state-of-the-art models may be overparameterized for sEMG-based inter-session and inter-subject gesture recognition tasks. △ Less

Submitted 19 February, 2024; v1 submitted 13 May, 2023; originally announced May 2023.

arXiv:2305.05111 [pdf, other]

When a CBR in Hand is Better than Twins in the Bush

Authors: Mobyen Uddin Ahmed, Shaibal Barua, Shahina Begum, Mir Riyanul Islam, Rosina O Weber

Abstract: AI methods referred to as interpretable are often discredited as inaccurate by supporters of the existence of a trade-off between interpretability and accuracy. In many problem contexts however this trade-off does not hold. This paper discusses a regression problem context to predict flight take-off delays where the most accurate data regression model was trained via the XGBoost implementation of… ▽ More AI methods referred to as interpretable are often discredited as inaccurate by supporters of the existence of a trade-off between interpretability and accuracy. In many problem contexts however this trade-off does not hold. This paper discusses a regression problem context to predict flight take-off delays where the most accurate data regression model was trained via the XGBoost implementation of gradient boosted decision trees. While building an XGB-CBR Twin and converting the XGBoost feature importance into global weights in the CBR model, the resultant CBR model alone provides the most accurate local prediction, maintains the global importance to provide a global explanation of the model, and offers the most interpretable representation for local explanations. This resultant CBR model becomes a benchmark of accuracy and interpretability for this problem context, and hence it is used to evaluate the two additive feature attribute methods SHAP and LIME to explain the XGBoost regression model. The results with respect to local accuracy and feature attribution lead to potentially valuable future work. △ Less

Submitted 8 May, 2023; originally announced May 2023.

Comments: The version of this paper published in ICCBR XCBR '22 contained an erroneous sum in Equation 3 that we have corrected in this version

Journal ref: ICCBR XCBR '22: 4th Workshop on XCBR: Case-based Reasoning for the Explanation of Intelligent Systems at ICCBR-2022, September, 2022, Nancy, France

arXiv:2305.01486 [pdf, other]

ARBEx: Attentive Feature Extraction with Reliability Balancing for Robust Facial Expression Learning

Authors: Azmine Toushik Wasi, Karlo Šerbetar, Raima Islam, Taki Hasan Rafi, Dong-Kyu Chae

Abstract: In this paper, we introduce a framework ARBEx, a novel attentive feature extraction framework driven by Vision Transformer with reliability balancing to cope against poor class distributions, bias, and uncertainty in the facial expression learning (FEL) task. We reinforce several data pre-processing and refinement methods along with a window-based cross-attention ViT to squeeze the best of the dat… ▽ More In this paper, we introduce a framework ARBEx, a novel attentive feature extraction framework driven by Vision Transformer with reliability balancing to cope against poor class distributions, bias, and uncertainty in the facial expression learning (FEL) task. We reinforce several data pre-processing and refinement methods along with a window-based cross-attention ViT to squeeze the best of the data. We also employ learnable anchor points in the embedding space with label distributions and multi-head self-attention mechanism to optimize performance against weak predictions with reliability balancing, which is a strategy that leverages anchor points, attention scores, and confidence values to enhance the resilience of label predictions. To ensure correct label classification and improve the models' discriminative power, we introduce anchor loss, which encourages large margins between anchor points. Additionally, the multi-head self-attention mechanism, which is also trainable, plays an integral role in identifying accurate labels. This approach provides critical elements for improving the reliability of predictions and has a substantial positive effect on final prediction capabilities. Our adaptive model can be integrated with any deep neural network to forestall challenges in various recognition tasks. Our strategy outperforms current state-of-the-art methodologies, according to extensive experiments conducted in a variety of contexts. △ Less

Submitted 14 July, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

Comments: 12 pages, 7 figures. Code: https://github.com/takihasan/ARBEx

Showing 1–50 of 204 results for author: Islam, R