
FedClust: Tackling Data Heterogeneity in Federated Learning through Weight-Driven Client Clustering

Authors: Md Sirajul Islam, Simin Javaherian, Fei Xu, Xu Yuan, Li Chen, Nian-Feng Tzeng

Abstract: Federated learning (FL) is an emerging distributed machine learning paradigm that enables collaborative training of machine learning models over decentralized devices without exposing their local data. One of the major challenges in FL is the presence of uneven data distributions across client devices, violating the well-known assumption of independent-and-identically-distributed (IID) training samples in conventional machine learning. To address the performance degradation issue incurred by such data heterogeneity, clustered federated learning (CFL) shows its promise by grouping clients into separate learning clusters based on the similarity of their local data distributions. However, state-of-the-art CFL approaches require a large number of communication rounds to learn the distribution similarities during training until the formation of clusters is stabilized. Moreover, some of these algorithms heavily rely on a predefined number of clusters, thus limiting their flexibility and adaptability. In this paper, we propose FedClust, a novel approach for CFL that leverages the correlation between local model weights and the data distribution of clients. FedClust groups clients into clusters in a one-shot manner by measuring the similarity degrees among clients based on the strategically selected partial weights of locally trained models. We conduct extensive experiments on four benchmark datasets with different non-IID data settings. Experimental results demonstrate that FedClust achieves up to ~45% higher model accuracy as well as faster convergence, with communication cost reduced by up to 2.7× compared to its state-of-the-art counterparts.
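The one-shot grouping idea in this abstract can be illustrated with a minimal sketch. The abstract does not say which partial weights are selected, which similarity measure is used, or how clusters are formed, so everything below is an assumption for illustration only: Euclidean distance, a hypothetical distance `threshold`, single-linkage merging, and made-up `client_weights`. It is not the authors' algorithm.

```python
import math

def pairwise_distance(u, v):
    """Euclidean distance between two flattened weight vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def one_shot_clusters(partial_weights, threshold):
    """Group clients whose partial weights lie within `threshold` of each other.

    Single-linkage merging via union-find; the number of clusters is not
    fixed in advance but follows from the (assumed) threshold.
    """
    n = len(partial_weights)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if pairwise_distance(partial_weights[i], partial_weights[j]) <= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

# Hypothetical partial weights uploaded once by four clients after local training.
client_weights = [
    [0.90, 0.10, 0.05],
    [0.88, 0.12, 0.04],
    [0.10, 0.85, 0.20],
    [0.12, 0.83, 0.22],
]
clusters = one_shot_clusters(client_weights, threshold=0.1)
print(clusters)
```

Because the grouping is computed from a single upload per client, no further communication rounds are spent on discovering cluster membership, which is the property the abstract emphasizes.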

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.01225 [pdf, other]

Hong-Ou-Mandel Interference with a Coexisting Clock using Transceivers for Synchronization over Deployed Fiber

Authors: Anirudh Ramesh, Daniel R. Reilly, Kim Fook Lee, Paul M. Moraw, Joaquin Chung, Md Shariful Islam, Cristián Peña, Xu Han, Rajkumar Kettimuthu, Prem Kumar, Gregory Kanter

Abstract: Interference between independently generated photons is a key step towards distributing entanglement over long distances, but it requires synchronization between the distantly-located photon sources. Synchronizing the clocks of such photon sources using coexisting two-way classical optical communications over the same fiber that transports the quantum photonic signals is a promising approach for achieving photon-photon interference over long distances, enabling entanglement distribution for quantum networking using the deployed fiber infrastructure. Here, we demonstrate photon-photon interference by observing the Hong-Ou-Mandel dip between two distantly-located sources: a weak coherent state source obtained by attenuating the output of a laser and a heralded single-photon source. We achieve a maximum dip visibility of $0.58 \pm 0.04$ when the two sources are connected via $4.3$ km of deployed fiber. Dip visibilities $>0.5$ are nonclassical and a first step towards achieving teleportation over the deployed fiber infrastructure. In our experiment, the classical optical communication is achieved with $-21$ dBm of optical signal launch power, which is used to synchronize the clocks in the two independent, distantly-located photon sources. The impact of spontaneous Raman scattering from the classical optical signals is mitigated by appropriate choice of the quantum and classical channel wavelengths. All equipment used in our experiment (the photon sources and the synchronization setup) is commercially available. Finally, our experiment represents a scalable approach to enabling practical quantum networking with commercial equipment and coexistence with classical communications in optical fiber.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2406.14856 [pdf, other]

Accessible, At-Home Detection of Parkinson's Disease via Multi-task Video Analysis

Authors: Md Saiful Islam, Tariq Adnan, Jan Freyberg, Sangwu Lee, Abdelrahman Abdelkader, Meghan Pawlik, Cathe Schwartz, Karen Jaffe, Ruth B. Schneider, E Ray Dorsey, Ehsan Hoque

Abstract: Limited access to neurological care leads to missed diagnoses of Parkinson's disease (PD), leaving many individuals unidentified and untreated. We trained a novel neural network-based fusion architecture to detect Parkinson's disease (PD) by analyzing features extracted from webcam recordings of three tasks: finger tapping, facial expression (smiling), and speech (uttering a sentence containing all letters of the alphabet). Additionally, the model incorporated Monte Carlo Dropout to improve prediction accuracy by considering uncertainties. The study participants (n = 845, 272 with PD) were randomly split into three sets: 60% for training, 20% for model selection (hyper-parameter tuning), and 20% for final performance evaluation. The dataset consists of 1102 sessions, each session containing videos of all three tasks. Our proposed model achieved significantly better accuracy, area under the ROC curve (AUROC), and sensitivity at non-inferior specificity compared to any single-task model. Withholding uncertain predictions further boosted the performance, achieving 88.0% (95% CI: 87.7% - 88.4%) accuracy, 93.0% (92.8% - 93.2%) AUROC, 79.3% (78.4% - 80.2%) sensitivity, and 92.6% (92.3% - 92.8%) specificity, at the expense of not being able to predict for 2.3% (2.0% - 2.6%) data. Further analysis suggests that the trained model does not exhibit any detectable bias across sex and ethnic subgroups and is most effective for individuals aged between 50 and 80. This accessible, low-cost approach requiring only an internet-enabled device with a webcam and microphone paves the way for convenient PD screening at home, particularly in regions with limited access to clinical specialists.

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2406.12954 [pdf, other]

Skin Cancer Images Classification using Transfer Learning Techniques

Authors: Md Sirajul Islam, Sanjeev Panta

Abstract: Skin cancer is one of the most common and deadliest types of cancer. Early diagnosis of skin cancer at a benign stage is critical to reducing cancer mortality. To detect skin cancer at an earlier stage, an automated system that can save the lives of many patients is essential. Many previous studies have addressed the problem of skin cancer diagnosis using various deep learning and transfer learning models. However, existing approaches are limited in accuracy and rely on time-consuming procedures. In this work, we applied five different pre-trained transfer learning approaches for binary classification of skin cancer detection at benign and malignant stages. To increase the accuracy of these models we fine-tune different layers and activation functions. We used a publicly available ISIC dataset to evaluate transfer learning approaches. For model stability, data augmentation techniques are applied to improve the randomness of the input dataset. These approaches are evaluated using different hyperparameters such as batch sizes, epochs, and optimizers. The experimental results show that the ResNet-50 model provides an accuracy of 0.935, F1-score of 0.86, and precision of 0.94.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.08534 [pdf, ps, other]

Optimizing Container Loading and Unloading through Dual-Cycling and Dockyard Rehandle Reduction Using a Hybrid Genetic Algorithm

Authors: Md. Mahfuzur Rahman, Md Abrar Jahin, Md. Saiful Islam, M. F. Mridha

Abstract: This paper addresses the optimization of container unloading and loading operations at ports, integrating quay-crane dual-cycling with dockyard rehandle minimization. We present a unified model encompassing both operations: ship container unloading and loading by quay crane, and the reduction of dockyard rehandles while loading the ship. We recognize that optimizing one aspect in isolation can lead to suboptimal outcomes due to interdependencies. Specifically, optimizing unloading sequences for minimal operation time may inadvertently increase dockyard rehandles during loading and vice versa. To address this NP-hard problem, we propose a hybrid genetic algorithm (GA) QCDC-DR-GA comprising one-dimensional and two-dimensional GA components. Our model, QCDC-DR-GA, consistently outperforms four state-of-the-art methods in maximizing dual cycles and minimizing dockyard rehandles. Compared to those methods, it reduced total operation time by 15-20% for large vessels. Statistical validation through a two-tailed paired t-test confirms the superiority of QCDC-DR-GA at a 5% significance level. The approach effectively combines QCDC optimization with dockyard rehandle minimization, optimizing the total unloading-loading time. Results underscore the inefficiency of separately optimizing QCDC and dockyard rehandles. Fragmented approaches, such as QCDC Scheduling Optimized by bi-level GA and GA-ILSRS (Scenario 2), show limited improvement compared to QCDC-DR-GA. As in GA-ILSRS (Scenario 1), neglecting dual-cycle optimization leads to performance inferior to that of QCDC-DR-GA. This emphasizes the necessity of simultaneously considering both aspects for optimal resource utilization and overall operational efficiency.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.07707 [pdf, other]

A Deep Learning Approach to Detect Complete Safety Equipment For Construction Workers Based On YOLOv7

Authors: Md. Shariful Islam, SM Shaqib, Shahriar Sultan Ramit, Shahrun Akter Khushbu, Mr. Abdus Sattar, Dr. Sheak Rashed Haider Noori

Abstract: In the construction sector, ensuring worker safety is of the utmost significance. In this study, a deep learning-based technique is presented for identifying safety gear worn by construction workers, such as helmets, goggles, jackets, gloves, and footwear. The recommended approach uses the YOLO v7 (You Only Look Once) object detection algorithm to precisely locate these safety items. The dataset utilized in this work consists of labeled images split into training, testing and validation sets. Each image has bounding box labels that indicate where the safety equipment is located within the image. The model is trained to identify and categorize the safety equipment based on the labeled dataset through an iterative training approach. We used a custom dataset to train this model. Our trained model performed admirably well, with good precision, recall, and F1-score for safety equipment recognition. Also, the model's evaluation produced encouraging results, with a mAP@0.5 score of 87.7%. The model performs effectively, making it possible to quickly identify safety equipment violations on building sites. A thorough evaluation of the outcomes reveals the model's advantages and points up potential areas for development. By offering an automatic and trustworthy method for safety equipment detection, this research makes a contribution to the fields of computer vision and workplace safety. The proposed deep learning-based approach will increase safety compliance and reduce the risk of accidents in the construction industry.

Submitted 13 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

arXiv:2406.00257 [pdf, other]

Are Large Vision Language Models up to the Challenge of Chart Comprehension and Reasoning? An Extensive Investigation into the Capabilities and Limitations of LVLMs

Authors: Mohammed Saidul Islam, Raian Rahman, Ahmed Masry, Md Tahmid Rahman Laskar, Mir Tafseer Nayeem, Enamul Hoque

Abstract: Natural language is a powerful complementary modality of communication for data visualizations, such as bar and line charts. To facilitate chart-based reasoning using natural language, various downstream tasks have been introduced recently such as chart question answering, chart summarization, and fact-checking with charts. These tasks pose a unique challenge, demanding both vision-language reasoning and a nuanced understanding of chart data tables, visual encodings, and natural language prompts. Despite the recent success of Large Language Models (LLMs) across diverse NLP tasks, their abilities and limitations in the realm of data visualization remain under-explored, possibly due to their lack of multi-modal capabilities. To bridge the gap, this paper presents the first comprehensive evaluation of the recently developed large vision language models (LVLMs) for chart understanding and reasoning tasks. Our evaluation includes a comprehensive assessment of LVLMs, including GPT-4V and Gemini, across four major chart reasoning tasks. Furthermore, we perform a qualitative evaluation of LVLMs' performance on a diverse range of charts, aiming to provide a thorough analysis of their strengths and weaknesses. Our findings reveal that LVLMs demonstrate impressive abilities in generating fluent texts covering high-level data insights while also encountering common problems like hallucinations, factual errors, and data bias. We highlight the key strengths and limitations of chart comprehension tasks, offering insights for future research.

Submitted 31 May, 2024; originally announced June 2024.

arXiv:2405.17206 [pdf, other]

A Novel Fusion Architecture for PD Detection Using Semi-Supervised Speech Embeddings

Authors: Tariq Adnan, Abdelrahman Abdelkader, Zipei Liu, Ekram Hossain, Sooyong Park, MD Saiful Islam, Ehsan Hoque

Abstract: We present a framework to recognize Parkinson's disease (PD) through an English pangram utterance speech collected using a web application from diverse recording settings and environments, including participants' homes. Our dataset includes a global cohort of 1306 participants, including 392 diagnosed with PD. Leveraging the diversity of the dataset, spanning various demographic properties (such as age, sex, and ethnicity), we used deep learning embeddings derived from semi-supervised models such as Wav2Vec 2.0, WavLM, and ImageBind representing the speech dynamics associated with PD. Our novel fusion model for PD classification, which aligns different speech embeddings into a cohesive feature space, demonstrated superior performance over standard concatenation-based fusion models and other baselines (including models built on traditional acoustic features). In a randomized data split configuration, the model achieved an Area Under the Receiver Operating Characteristic Curve (AUROC) of 88.94% and an accuracy of 85.65%. Rigorous statistical analysis confirmed that our model performs equitably across various demographic subgroups in terms of sex, ethnicity, and age, and remains robust regardless of disease duration. Furthermore, our model, when tested on two entirely unseen test datasets collected from clinical settings and from a PD care center, maintained AUROC scores of 82.12% and 78.44%, respectively. This affirms the model's robustness and its potential to enhance accessibility and health equity in real-world applications.

Submitted 21 May, 2024; originally announced May 2024.

Comments: 25 pages, 5 figures, and 4 tables

arXiv:2405.15813 [pdf, other]

doi 10.1145/3664815

From CNNs to Transformers in Multimodal Human Action Recognition: A Survey

Authors: Muhammad Bilal Shaikh, Syed Mohammed Shamsul Islam, Douglas Chai, Naveed Akhtar

Abstract: Due to its widespread applications, human action recognition is one of the most widely studied research problems in Computer Vision. Recent studies have shown that addressing it using multimodal data leads to superior performance as compared to relying on a single data modality. During the adoption of deep learning for visual modelling in the last decade, action recognition approaches have mainly relied on Convolutional Neural Networks (CNNs). However, the recent rise of Transformers in visual modelling is now also causing a paradigm shift for the action recognition task. This survey captures this transition while focusing on Multimodal Human Action Recognition (MHAR). Unique to the induction of multimodal computational models is the process of "fusing" the features of the individual data modalities. Hence, we specifically focus on the fusion design aspects of the MHAR approaches. We analyze the classic and emerging techniques in this regard, while also highlighting the popular trends in the adaption of CNN and Transformer building blocks for the overall problem. In particular, we emphasize recent design choices that have led to more efficient MHAR models. Unlike existing reviews, which discuss Human Action Recognition from a broad perspective, this survey is specifically aimed at pushing the boundaries of MHAR research by identifying promising architectural and fusion design choices to train practicable models. We also provide an outlook of the multimodal datasets from their scale and evaluation viewpoint. Finally, building on the reviewed literature, we discuss the challenges and future avenues for MHAR.

Submitted 21 May, 2024; originally announced May 2024.

Comments: 23 pages, 5 figures and 3 Tables. To appear in ACM Trans. Multimedia Comput. Commun. Appl.(TOMM) 2024

ACM Class: A.1; I.2.10

arXiv:2405.13219 [pdf, other]

How Reliable AI Chatbots are for Disease Prediction from Patient Complaints?

Authors: Ayesha Siddika Nipu, K M Sajjadul Islam, Praveen Madiraju

Abstract: Artificial Intelligence (AI) chatbots leveraging Large Language Models (LLMs) are gaining traction in healthcare for their potential to automate patient interactions and aid clinical decision-making. This study examines the reliability of AI chatbots, specifically GPT 4.0, Claude 3 Opus, and Gemini Ultra 1.0, in predicting diseases from patient complaints in the emergency department. The methodology includes few-shot learning techniques to evaluate the chatbots' effectiveness in disease prediction. We also fine-tune the transformer-based model BERT and compare its performance with the AI chatbots. Results suggest that GPT 4.0 achieves high accuracy with increased few-shot data, while Gemini Ultra 1.0 performs well with fewer examples, and Claude 3 Opus maintains consistent performance. BERT's performance, however, is lower than all the chatbots, indicating limitations due to limited labeled data. Despite the chatbots' varying accuracy, none of them are sufficiently reliable for critical medical decision-making, underscoring the need for rigorous validation and human oversight. This study reflects that while AI chatbots have potential in healthcare, they should complement, not replace, human expertise to ensure patient safety. Further refinement and research are needed to improve AI-based healthcare applications' reliability for disease prediction.

Submitted 21 May, 2024; originally announced May 2024.

Comments: 24th IEEE International Conference on Information Reuse and Integration (IEEE IRI 2024), San Jose, CA, USA

arXiv:2405.11188 [pdf, other]

Wind Power Prediction across Different Locations using Deep Domain Adaptive Learning

Authors: Md Saiful Islam Sajol, Md Shazid Islam, A S M Jahid Hasan, Md Saydur Rahman, Jubair Yusuf

Abstract: Accurate prediction of wind power is essential for the grid integration of this intermittent renewable source and aiding grid planners in forecasting available wind capacity. Spatial differences lead to discrepancies in climatological data distributions between two geographically dispersed regions, consequently making the prediction task more difficult. Thus, a prediction model that learns from the data of a particular climatic region can suffer from being less robust. A deep neural network (DNN) based domain adaptive approach is proposed to counter this drawback. Effective weather features from a large set of weather parameters are selected using a random forest approach. A pre-trained model from the source domain is utilized to perform the prediction task, assuming no source data is available during target domain prediction. The weights of only the last few layers of the DNN model are updated throughout the task, keeping the rest of the network unchanged, making the model faster compared to the traditional approaches. The proposed approach demonstrates higher accuracy ranging from 6.14% to even 28.44% compared to the traditional non-adaptive method.

Submitted 18 May, 2024; originally announced May 2024.

arXiv:2405.05972 [pdf]

Simulation of Ge on Si Photodiode with photon-trapping micro-nano holes with -3dB bandwidth of >60 GHz at NIR wavelength

Authors: Ekaterina Ponizovskaya Devine, Toshishige Yamada, Shih-Yuan Wang, M Saif Islam

Abstract: The study proposes an ultra-thin back side illuminated (BSI) and top-illuminated, Ge on Si photodetector (PD), for 1 to 1.4 microns wavelength range. The Ge thickness of 350 nm allows us to achieve high-speed performance at >60 GHz, while the nanostructure at the bottom of the Ge layer helps to increase the optical absorption efficiency to above 80%. The BSI PD allows the PD or PD array wafer to be stacked with an electronic wafer for signal processing and transmission for optical interconnect applications such as short-reach links in data centers. Nano-microhole parameters in randomized composite formation on the bottom layer are optimized with Monte-Carlo molecular dynamics simulations incorporating charge transport to enable wide-spectral, highly efficient, and ultra-fast PDs.

Submitted 9 April, 2024; originally announced May 2024.

Comments: 12 pages, 3 figures

arXiv:2405.02729 [pdf, ps, other]

Ulam's method for computing stationary densities of invariant measures for piecewise convex maps with countably infinite number of branches

Authors: Md Shafiqul Islam, Paweł Góra, A H M Mahbubur Rahman

Abstract: Let $τ: I=[0, 1]\to [0, 1]$ be a piecewise convex map with countably infinite number of branches. In \cite{GIR}, the existence of absolutely continuous invariant measure (ACIM) $μ$ for $τ$ and the exactness of the system $(τ, μ)$ have been proven. In this paper, we develop an Ulam method for approximation of $f^*$, the density of ACIM $μ$. We construct a sequence $\{τ_n\}_{n=1}^\infty$ of maps $τ_n: I\to I$ such that $τ_n$ has a finite number of branches and the sequence $τ_n$ converges to $τ$ almost uniformly. Using supremum norms and Lasota-Yorke type inequalities, we prove the existence of ACIMs $μ_n$ for $τ_n$ with the densities $f_n$. For a fixed $n$, we apply Ulam's method with $k$ subintervals to $τ_n$ and compute approximations $f_{n,k}$ of $f_n$. We prove that $f_{n,k}\to f^*$ as $n\to \infty, k\to \infty,$ both a.e. and in $L^1$. We provide examples of piecewise convex maps $τ$ with countably infinite number of branches, their approximations $τ_n$ with finite number of branches and for increasing values of parameter $k$ show the errors $\|f^*-f_{n,k}\|_1$.
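For readers unfamiliar with the Ulam step mentioned in this abstract, the following is a minimal sketch of the classical method applied to a single finite-branch map with $k$ equal subintervals. It does not reproduce the paper's two-level scheme (first $τ_n$, then $f_{n,k}$). The doubling map used as the test case, the sampling-based estimate of the transition matrix, and the names `ulam_density`, `samples_per_cell`, and `iterations` are all assumptions chosen for illustration.

```python
def ulam_density(tau, k, samples_per_cell=100, iterations=200):
    """Approximate the invariant density of tau on [0, 1] using k equal cells."""
    # Transition matrix: P[i][j] estimates the fraction of cell i mapped into cell j,
    # using evenly spaced sample points inside each cell.
    P = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for s in range(samples_per_cell):
            x = (i + (s + 0.5) / samples_per_cell) / k
            j = min(int(tau(x) * k), k - 1)
            P[i][j] += 1.0 / samples_per_cell
    # Power iteration for the stationary row vector p = p P.
    # (Assumes the iteration converges; a periodic matrix would need averaging.)
    p = [1.0 / k] * k
    for _ in range(iterations):
        p = [sum(p[i] * P[i][j] for i in range(k)) for j in range(k)]
        total = sum(p)
        p = [v / total for v in p]
    # Piecewise-constant density: cell mass divided by the cell width 1/k.
    return [v * k for v in p]

def doubling_map(x):
    """Two-branch test map whose invariant density is known to be constant 1."""
    return (2.0 * x) % 1.0

density = ulam_density(doubling_map, k=8)
print(density)
```

For the doubling map the invariant measure is Lebesgue measure, so every entry of `density` should be close to 1; for other maps, increasing `k` refines the piecewise-constant approximation.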

Submitted 4 May, 2024; originally announced May 2024.

Comments: 19 pages, 9 figures

MSC Class: 37E05 (Primary) 37M15 (Secondary)

arXiv:2404.15090 [pdf]

Galerkin-Bernstein Approximations for the System of Third-Order Nonlinear Boundary Value Problems

Authors: Snigdha Dhar, Md. Shafiqul Islam

Abstract: This paper is devoted to finding the numerical solutions of one dimensional general nonlinear system of third-order boundary value problems (BVPs) for the pair of functions using Galerkin weighted residual method. We derive mathematical formulations in matrix form, in detail, by exploiting Bernstein polynomials as basis functions. A reasonable accuracy is found when the proposed method is used on a few examples. At the end of the study, a comparison is made between the approximate and exact solutions, and also with the solutions of the existing methods. Our results converge monotonically to the exact solutions. In addition, we show that the derived formulations may be applicable by reducing higher order complicated BVP into a lower order system of BVPs, and the performance of the numerical solutions is satisfactory.

Submitted 23 April, 2024; originally announced April 2024.

MSC Class: 65L60

arXiv:2404.06873 [pdf, other]

Aluminium nanoparticle-based ultra-wideband high-performance polarizer

Authors: Md. Shariful Islam, Ahmed Zubair

Abstract: The polarizer-based device industry is expanding quickly, requiring high-quality research on nanoscale wideband polarizers. Here, we investigated the possibility of utilizing Al dimer nanostructures on broad-band polarizers. Metals are always considered promising candidates for reflection-based polarizer development because of their high extinction ratio. This study proposes a nanoparticle polarizer comprised of semi-immersed Al nanodimers with a 200 nm radius on a CaF_2 substrate. Our proposed polarizer has effective polarization anisotropy in the near-infrared (NIR) and THz range. This study includes calculating performance parameters for the extraction of the proposed polarizer, including insertion loss, extinction ratio (ER), Mueller matrix values, and polarization ellipse diagram. The finite-difference time-domain (FDTD) simulation-based results suggested obtaining more than 55 dB extinction ratio for the 0.2 to 9 THz range. The average extinction ratio and insertion loss over the 1-1665 micrometer wavelength were 29.01 dB and ~1 dB, respectively. We have reviewed recent reports of similar nanoparticle and wire grid-based polarizers to evaluate our Al nanodimer-based polarizer and performed a comparative analysis. The idea of Al dimer and the insight gained from the results extracted from the rigorous simulation report suggested a great opportunity for developing micro-scale metallic wideband polarizers.

Submitted 11 July, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.03338 [pdf]

Approximation of Some Nonlinear Fractional Order BVPs by Weighted Residual Methods

Authors: Umme Ruman, Md. Shafiqul Islam

Abstract: To extract the approximate solutions in the case of nonlinear fractional order differential equations with the homogeneous and nonhomogeneous boundary conditions, the weighted residual method is embedded here. We exploit three methods such as Galerkin, Least Square, and Collocation for the efficient numerical solution of nonlinear two-point boundary value problems. Some nonlinear cases are examined for observing the maximum absolute errors by the considered methods, demonstrating the accuracy and reliability of the present technique using the modified Legendre and modified Bernoulli polynomials as weight functions. The mathematical formulations and computational algorithms are more straightforward and uncomplicated to understand. Absolute errors and the graphical representation reflect that our method is more accurate and reliable.

Submitted 4 April, 2024; originally announced April 2024.

Comments: 14 Pages, 10 Figures

MSC Class: 65-XX

arXiv:2403.05519 [pdf, other]

doi 10.1145/3530691

Authorship Attribution in Bangla Literature (AABL) via Transfer Learning using ULMFiT

Authors: Aisha Khatun, Anisur Rahman, Md Saiful Islam, Hemayet Ahmed Chowdhury, Ayesha Tasnim

Abstract: Authorship Attribution is the task of creating an appropriate characterization of text that captures the authors' writing style to identify the original author of a given piece of text. With increased anonymity on the internet, this task has become increasingly crucial in various security and plagiarism detection fields. Despite significant advancements in other languages such as English, Spanish, and Chinese, Bangla lacks comprehensive research in this field due to its complex linguistic feature and sentence structure. Moreover, existing systems are not scalable when the number of author increases, and the performance drops for small number of samples per author. In this paper, we propose the use of Average-Stochastic Gradient Descent Weight-Dropped Long Short-Term Memory (AWD-LSTM) architecture and an effective transfer learning approach that addresses the problem of complex linguistic features extraction and scalability for authorship attribution in Bangla Literature (AABL). We analyze the effect of different tokenization, such as word, sub-word, and character level tokenization, and demonstrate the effectiveness of these tokenizations in the proposed model. Moreover, we introduce the publicly available Bangla Authorship Attribution Dataset of 16 authors (BAAD16) containing 17,966 sample texts and 13.4+ million words to solve the standard dataset scarcity problem and release six variations of pre-trained language models for use in any Bangla NLP downstream task. For evaluation, we used our developed BAAD16 dataset as well as other publicly available datasets. Empirically, our proposed model outperformed state-of-the-art models and achieved 99.8% accuracy in the BAAD16 dataset. Furthermore, we showed that the proposed system scales much better even with an increasing number of authors, and performance remains steady despite few training samples.

Submitted 8 March, 2024; originally announced March 2024.

Comments: Accepted in ACM TALLIP August 2022

arXiv:2403.04144 [pdf, other]

FedClust: Optimizing Federated Learning on Non-IID Data through Weight-Driven Client Clustering

Authors: Md Sirajul Islam, Simin Javaherian, Fei Xu, Xu Yuan, Li Chen, Nian-Feng Tzeng

Abstract: Federated learning (FL) is an emerging distributed machine learning paradigm enabling collaborative model training on decentralized devices without exposing their local data. A key challenge in FL is the uneven data distribution across client devices, violating the well-known assumption of independent-and-identically-distributed (IID) training samples in conventional machine learning. Clustered federated learning (CFL) addresses this challenge by grouping clients based on the similarity of their data distributions. However, existing CFL approaches require a large number of communication rounds for stable cluster formation and rely on a predefined number of clusters, thus limiting their flexibility and adaptability. This paper proposes FedClust, a novel CFL approach leveraging correlations between local model weights and client data distributions. FedClust groups clients into clusters in a one-shot manner using strategically selected partial model weights and dynamically accommodates newcomers in real-time. Experimental results demonstrate FedClust outperforms baseline approaches in terms of accuracy and communication costs.

Submitted 6 March, 2024; originally announced March 2024.

arXiv:2402.18600 [pdf]

Artificial Intelligence and Diabetes Mellitus: An Inside Look Through the Retina

Authors: Yasin Sadeghi Bazargani, Majid Mirzaei, Navid Sobhi, Mirsaeed Abdollahi, Ali Jafarizadeh, Siamak Pedrammehr, Roohallah Alizadehsani, Ru San Tan, Sheikh Mohammed Shariful Islam, U. Rajendra Acharya

Abstract: Diabetes mellitus (DM) predisposes patients to vascular complications. Retinal images and vasculature reflect the body's micro- and macrovascular health. They can be used to diagnose DM complications, including diabetic retinopathy (DR), neuropathy, nephropathy, and atherosclerotic cardiovascular disease, as well as forecast the risk of cardiovascular events. Artificial intelligence (AI)-enabled systems developed for high-throughput detection of DR using digitized retinal images have become clinically adopted. Beyond DR screening, AI integration also holds immense potential to address challenges associated with the holistic care of the patient with DM. In this work, we aim to comprehensively review the literature for studies on AI applications based on retinal images related to DM diagnosis, prognostication, and management. We will describe the findings of holistic AI-assisted diabetes care, including but not limited to DR screening, and discuss barriers to implementing such systems, including issues concerning ethics, data privacy, equitable access, and explainability. With the ability to evaluate the patient's health status vis a vis DM complication as well as risk prognostication of future cardiovascular complications, AI-assisted retinal image analysis has the potential to become a central tool for modern personalized medicine in patients with DM.

Submitted 27 February, 2024; originally announced February 2024.

Comments: 44 Pages, 6 figures, 1 table, 166 references

ACM Class: J.3.2; J.3.3

arXiv:2402.15728 [pdf]

Design and Implementation of Low-Cost Electric Vehicles (Evs) Supercharger: A Comprehensive Review

Authors: Md Khaledur Rahman, Faysal Amin Tanvir, Md Saiful Islam, Md Shameem Ahsan, Manam Ahmed

Abstract: This article presents a probabilistic modeling method utilizing smart meter data and an innovative agent-based simulator for electric vehicles (EVs). The aim is to assess the effects of different cost-driven EV charging strategies on the power distribution network (PDN). We investigate the effects of a 40% EV adoption on three parts of Frederiksberg's low voltage distribution network (LVDN), a densely urbanized municipality in Denmark. Our findings indicate that cable and transformer overloading especially pose a challenge. However, the impact of EVs varies significantly between each LVDN area and charging scenario. Across scenarios and LVDNs, the share of cables facing congestion ranges between 5% and 60%. It is also revealed that time-of-use (ToU)-based and single-day cost-minimized charging could be beneficial for LVDNs with moderate EV adoption rates. In contrast, multiple-day optimization will likely lead to severe congestion, as such strategies concentrate demand on a single day that would otherwise be distributed over several days, thus raising concerns about how to prevent it. The broader implications of our research suggest that, despite initial worries primarily centered on congestion due to unregulated charging during peak hours, a transition to cost-based smart charging, propelled by an increasing awareness of time-dependent electricity prices, may lead to a significant rise in charging synchronization, bringing about undesirable consequences for the power distribution network (PDN).

Submitted 24 February, 2024; originally announced February 2024.

arXiv:2402.09975 [pdf]

Current and future roles of artificial intelligence in retinopathy of prematurity

Authors: Ali Jafarizadeh, Shadi Farabi Maleki, Parnia Pouya, Navid Sobhi, Mirsaeed Abdollahi, Siamak Pedrammehr, Chee Peng Lim, Houshyar Asadi, Roohallah Alizadehsani, Ru-San Tan, Sheikh Mohammad Shariful Islam, U. Rajendra Acharya

Abstract: Retinopathy of prematurity (ROP) is a severe condition affecting premature infants, leading to abnormal retinal blood vessel growth, retinal detachment, and potential blindness. While semi-automated systems have been used in the past to diagnose ROP-related plus disease by quantifying retinal vessel features, traditional machine learning (ML) models face challenges like accuracy and overfitting. Recent advancements in deep learning (DL), especially convolutional neural networks (CNNs), have significantly improved ROP detection and classification. The i-ROP deep learning (i-ROP-DL) system also shows promise in detecting plus disease, offering reliable ROP diagnosis potential. This research comprehensively examines the contemporary progress and challenges associated with using retinal imaging and artificial intelligence (AI) to detect ROP, offering valuable insights that can guide further investigation in this domain. Based on 89 original studies in this field (out of 1487 studies that were comprehensively reviewed), we concluded that traditional methods for ROP diagnosis suffer from subjectivity and manual analysis, leading to inconsistent clinical decisions. AI holds great promise for improving ROP management. This review explores AI's potential in ROP detection, classification, diagnosis, and prognosis.

Submitted 15 February, 2024; originally announced February 2024.

Comments: 28 pages, 8 figures, 2 tables, 235 references, 1 supplementary table

ACM Class: J.3.2; J.3.3

arXiv:2402.01804 [pdf, other]

Analysis of Internet of Things Implementation Barriers in the Cold Supply Chain: An Integrated ISM-MICMAC and DEMATEL Approach

Authors: Kazrin Ahmad, Md. Saiful Islam, Md Abrar Jahin, M. F. Mridha

Abstract: Integrating Internet of Things (IoT) technology inside the cold supply chain can enhance transparency, efficiency, and quality, optimizing operating procedures and increasing productivity. The integration of IoT in this complicated setting is hindered by specific barriers that need a thorough examination. Prominent barriers to IoT implementation in the cold supply chain are identified using a two-stage model. After reviewing the available literature on the topic of IoT implementation, a total of 13 barriers were found. The survey data was cross-validated for quality, and Cronbach's alpha test was employed to ensure validity. This research applies the interpretative structural modeling technique in the first phase to identify the main barriers. Among those barriers, "regularity compliance" and "cold chain networks" are key drivers for IoT adoption strategies. MICMAC's driving and dependence power element categorization helps evaluate the barrier interactions. In the second phase of this research, a decision-making trial and evaluation laboratory methodology was employed to identify causal relationships between barriers and evaluate them according to their relative importance. Each cause is a potential drive, and if its efficiency can be enhanced, the system as a whole benefits. The research findings provide industry stakeholders, governments, and organizations with significant drivers of IoT adoption to overcome these barriers and optimize the utilization of IoT technology to improve the effectiveness and reliability of the cold supply chain.

Submitted 27 May, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

arXiv:2402.01208 [pdf, other]

Location Agnostic Adaptive Rain Precipitation Prediction using Deep Learning

Authors: Md Shazid Islam, Md Saydur Rahman, Md Saad Ul Haque, Farhana Akter Tumpa, Md Sanzid Bin Hossain, Abul Al Arabi

Abstract: Rain precipitation prediction is a challenging task as it depends on weather and meteorological features which vary from location to location. As a result, a prediction model that performs well at one location does not perform well at other locations due to the distribution shifts. In addition, due to global warming, the weather patterns are changing very rapidly year by year which creates the possibility of ineffectiveness of those models even at the same location as time passes. In our work, we have proposed an adaptive deep learning-based framework in order to provide a solution to the aforementioned challenges. Our method can generalize the model for the prediction of precipitation for any location where the methods without adaptation fail. Our method has shown 43.51%, 5.09%, and 38.62% improvement after adaptation using a deep neural network for predicting the precipitation of Paris, Los Angeles, and Tokyo, respectively.

Submitted 2 February, 2024; originally announced February 2024.

arXiv:2402.01206 [pdf, other]

Comparative Evaluation of Weather Forecasting using Machine Learning Models

Authors: Md Saydur Rahman, Farhana Akter Tumpa, Md Shazid Islam, Abul Al Arabi, Md Sanzid Bin Hossain, Md Saad Ul Haque

Abstract: Gaining a deeper understanding of weather and being able to predict its future conduct have always been considered important endeavors for the growth of our society. This research paper explores the advancements in understanding and predicting nature's behavior, particularly in the context of weather forecasting, through the application of machine learning algorithms. By leveraging the power of machine learning, data mining, and data analysis techniques, significant progress has been made in this field. This study focuses on analyzing the contributions of various machine learning algorithms in predicting precipitation and temperature patterns using a 20-year dataset from a single weather station in Dhaka city. Algorithms such as Gradient Boosting, AdaBoosting, Artificial Neural Network, Stacking Random Forest, Stacking Neural Network, and Stacking KNN are evaluated and compared based on their performance metrics, including Confusion matrix measurements. The findings highlight remarkable achievements and provide valuable insights into their performances and features correlation.

Submitted 2 February, 2024; originally announced February 2024.

arXiv:2401.17574 [pdf, other]

Scavenging Hyena: Distilling Transformers into Long Convolution Models

Authors: Tokiniaina Raharison Ralambomihanta, Shahrad Mohammadzadeh, Mohammad Sami Nur Islam, Wassim Jabbour, Laurence Liang

Abstract: The rapid evolution of Large Language Models (LLMs), epitomized by architectures like GPT-4, has reshaped the landscape of natural language processing. This paper introduces a pioneering approach to address the efficiency concerns associated with LLM pre-training, proposing the use of knowledge distillation for cross-architecture transfer. Leveraging insights from the efficient Hyena mechanism, our method replaces attention heads in transformer models by Hyena, offering a cost-effective alternative to traditional pre-training while confronting the challenge of processing long contextual information, inherent in quadratic attention mechanisms. Unlike conventional compression-focused methods, our technique not only enhances inference speed but also surpasses pre-training in terms of both accuracy and efficiency. In the era of evolving LLMs, our work contributes to the pursuit of sustainable AI solutions, striking a balance between computational power and environmental impact.

Submitted 30 January, 2024; originally announced January 2024.

Comments: 9 pages, 2 figures

arXiv:2401.16350 [pdf, other]

FedFair^3: Unlocking Threefold Fairness in Federated Learning

Authors: Simin Javaherian, Sanjeev Panta, Shelby Williams, Md Sirajul Islam, Li Chen

Abstract: Federated Learning (FL) is an emerging paradigm in machine learning without exposing clients' raw data. In practical scenarios with numerous clients, encouraging fair and efficient client participation in federated learning is of utmost importance, which is also challenging given the heterogeneity in data distribution and device properties. Existing works have proposed different client-selection methods that consider fairness; however, they fail to select clients with high utilities while simultaneously achieving fair accuracy levels. In this paper, we propose a fair client-selection approach that unlocks threefold fairness in federated learning. In addition to having a fair client-selection strategy, we enforce an equitable number of rounds for client participation and ensure a fair accuracy distribution over the clients. The experimental results demonstrate that FedFair^3, in comparison to the state-of-the-art baselines, achieves 18.15% less accuracy variance on the IID data and 54.78% on the non-IID data, without decreasing the global accuracy. Furthermore, it shows 24.36% less wall-clock training time on average.

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.15299 [pdf, other]

SupplyGraph: A Benchmark Dataset for Supply Chain Planning using Graph Neural Networks

Authors: Azmine Toushik Wasi, MD Shafikul Islam, Adipto Raihan Akib

Abstract: Graph Neural Networks (GNNs) have gained traction across different domains such as transportation, bio-informatics, language processing, and computer vision. However, there is a noticeable absence of research on applying GNNs to supply chain networks. Supply chain networks are inherently graph-like in structure, making them prime candidates for applying GNN methodologies. This opens up a world of possibilities for optimizing, predicting, and solving even the most complex supply chain problems. A major setback in this approach lies in the absence of real-world benchmark datasets to facilitate the research and resolution of supply chain problems using GNNs. To address the issue, we present a real-world benchmark dataset for temporal tasks, obtained from one of the leading FMCG companies in Bangladesh, focusing on supply chain planning for production purposes. The dataset includes temporal data as node features to enable sales predictions, production planning, and the identification of factory issues. By utilizing this dataset, researchers can employ GNNs to address numerous supply chain problems, thereby advancing the field of supply chain analytics and planning. Source: https://github.com/CIOL-SUST/SupplyGraph

Submitted 27 January, 2024; originally announced January 2024.

Comments: 9 pages, 8 figures; Accepted to 4th workshop on Graphs and more Complex structures for Learning and Reasoning, colocated with AAAI 2024

ACM Class: I.2.1; I.2.8; E.0; J.2; H.3.7

arXiv:2401.14422 [pdf, other]

Location Agnostic Source-Free Domain Adaptive Learning to Predict Solar Power Generation

Authors: Md Shazid Islam, A S M Jahid Hasan, Md Saydur Rahman, Jubair Yusuf, Md Saiful Islam Sajol, Farhana Akter Tumpa

Abstract: The prediction of solar power generation is a challenging task due to its dependence on climatic characteristics that exhibit spatial and temporal variability. The performance of a prediction model may vary across different places due to changes in data distribution, resulting in a model that works well in one region but not in others. Furthermore, as a consequence of global warming, there is a notable acceleration in the alteration of weather patterns on an annual basis. This phenomenon introduces the potential for diminished efficacy of existing models, even within the same geographical region, as time progresses. In this paper, a domain adaptive deep learning-based framework is proposed to estimate solar power generation using weather features that can solve the aforementioned challenges. A feed-forward deep convolutional network model is trained for a known location dataset in a supervised manner and utilized to predict the solar power of an unknown location later. This adaptive data-driven approach exhibits notable advantages in terms of computing speed, storage efficiency, and its ability to improve outcomes in scenarios where state-of-the-art non-adaptive methods fail. Our method has shown an improvement of 10.47%, 7.44%, and 5.11% in solar power prediction accuracy compared to the best performing non-adaptive method for California (CA), Florida (FL) and New York (NY), respectively.

Submitted 6 February, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

arXiv:2401.06088 [pdf, other]

Autocompletion of Chief Complaints in the Electronic Health Records using Large Language Models

Authors: K M Sajjadul Islam, Ayesha Siddika Nipu, Praveen Madiraju, Priya Deshpande

Abstract: The Chief Complaint (CC) is a crucial component of a patient's medical record as it describes the main reason or concern for seeking medical care. It provides critical information for healthcare providers to make informed decisions about patient care. However, documenting CCs can be time-consuming for healthcare providers, especially in busy emergency departments. To address this issue, an autocompletion tool that suggests accurate and well-formatted phrases or sentences for clinical notes can be a valuable resource for triage nurses. In this study, we utilized text generation techniques to develop machine learning models using CC data. In our proposed work, we train a Long Short-Term Memory (LSTM) model and fine-tune three different variants of Biomedical Generative Pretrained Transformers (BioGPT), namely microsoft/biogpt, microsoft/BioGPT-Large, and microsoft/BioGPT-Large-PubMedQA. Additionally, we tune a prompt by incorporating exemplar CC sentences, utilizing the OpenAI API of GPT-4. We evaluate the models' performance based on the perplexity score, modified BERTScore, and cosine similarity score. The results show that BioGPT-Large exhibits superior performance compared to the other models. It consistently achieves a remarkably low perplexity score of 1.65 when generating CC, whereas the baseline LSTM model achieves the best perplexity score of 170. Further, we evaluate and assess the proposed models' performance and the outcome of GPT-4.0. Our study demonstrates that utilizing LLMs such as BioGPT, leads to the development of an effective autocompletion tool for generating CC documentation in healthcare settings.

Submitted 11 January, 2024; originally announced January 2024.

Comments: IEEE BigData 2023 - Sorrento, Italy. 10 Pages, 4 Figures, 5 Tables

arXiv:2312.15160 [pdf, other]

Human-AI Collaboration in Real-World Complex Environment with Reinforcement Learning

Authors: Md Saiful Islam, Srijita Das, Sai Krishna Gottipati, William Duguay, Clodéric Mars, Jalal Arabneydi, Antoine Fagette, Matthew Guzdial, Matthew E. Taylor

Abstract: Recent advances in reinforcement learning (RL) and Human-in-the-Loop (HitL) learning have made human-AI collaboration easier for humans to team with AI agents. Leveraging human expertise and experience with AI in intelligent systems can be efficient and beneficial. Still, it is unclear to what extent human-AI collaboration will be successful, and how such teaming performs compared to humans or AI agents only. In this work, we show that learning from humans is effective and that human-AI collaboration outperforms human-controlled and fully autonomous AI agents in a complex simulation environment. In addition, we have developed a new simulator for critical infrastructure protection, focusing on a scenario where AI-powered drones and human teams collaborate to defend an airport against enemy drone attacks. We develop a user interface to allow humans to assist AI agents effectively. We demonstrated that agents learn faster while learning from policy correction compared to learning from humans or agents. Furthermore, human-AI collaboration requires lower mental and temporal demands, reduces human effort, and yields higher performance than if humans directly controlled all agents. In conclusion, we show that humans can provide helpful advice to the RL agents, allowing them to improve learning in a multi-agent setting.

Submitted 22 December, 2023; originally announced December 2023.

Comments: Submitted to Neural Computing and Applications

arXiv:2312.08381 [pdf]

An Explainable Machine Learning Framework for the Accurate Diagnosis of Ovarian Cancer

Authors: Asif Newaz, Abdullah Taharat, Md Sakibul Islam, A. G. M. Fuad Hasan Akanda

Abstract: Ovarian cancer (OC) is one of the most prevalent types of cancer in women. Early and accurate diagnosis is crucial for the survival of the patients. However, the majority of women are diagnosed in advanced stages due to the lack of effective biomarkers and accurate screening tools. While previous studies sought a common biomarker, our study suggests different biomarkers for the premenopausal and postmenopausal populations. This can provide a new perspective in the search for novel predictors for the effective diagnosis of OC. Lack of explainability is one major limitation of current AI systems. The stochastic nature of the ML algorithms raises concerns about the reliability of the system as it is difficult to interpret the reasons behind the decisions. To increase the trustworthiness and accountability of the diagnostic system as well as to provide transparency and explanations behind the predictions, explainable AI has been incorporated into the ML framework. SHAP is employed to quantify the contributions of the selected biomarkers and determine the most discriminative features. A hybrid decision support system has been established that can eliminate the bottlenecks caused by the black-box nature of the ML algorithms providing a safe and trustworthy AI tool. The diagnostic accuracy obtained from the proposed system outperforms the existing methods as well as the state-of-the-art ROMA algorithm by a substantial margin which signifies its potential to be an effective tool in the differential diagnosis of OC.

Submitted 11 December, 2023; originally announced December 2023.

arXiv:2312.06073 [pdf]

DFT based investigation of structural, elastic, optoelectronic, thermophysical and superconducting state properties of binary Mo3P at different pressures

Authors: Md. Sohel Rana, Razu Ahmed, Md. Sajidul Islam, R. S. Islam, S. H. Naqib

Abstract: In recent years, the investigation of novel materials for various technological applications has gained much importance in materials science research. Tri-molybdenum phosphide (Mo3P), a promising transition metal phosphide (TMP), has gathered significant attention due to its unique structural and electronic properties, which already make it a potentially valuable system for catalytic and electronic device applications. Through an in-depth study using density functional theory (DFT) calculations, this work aims to clarify the basic properties of the Mo3P compound at different pressures. In this work, we have studied the structural, elastic, optoelectronic and thermophysical properties of the binary Mo3P compound. In this investigation, we varied uniform hydrostatic pressure from 0 GPa to 30 GPa. A complete geometrical optimization for structural parameters is performed and the obtained values are in good accord with the experimental values where available. It is also found that Mo3P possesses a very low level of elastic anisotropy, reasonably good machinability, ductile nature, relatively high Vickers hardness, high Debye temperature and high melting temperature. Thermomechanical properties indicate that the compound has potential to be used as a thermal barrier coating material. The bonding nature in Mo3P has been explored. The electronic band structure shows that Mo3P has no band gap and exhibits conventional metallic behavior. All of the energy dependent optical characteristics demonstrate apparent metallic behavior and agree exactly with the electronic density of states calculations. The compound has excellent reflective and absorptive properties suitable for optical applications. Pressure dependent variations of the physical properties are explored and their possible link with superconductivity has been discussed.

Submitted 10 December, 2023; originally announced December 2023.

arXiv:2312.05780 [pdf, other]

PULSAR: Graph based Positive Unlabeled Learning with Multi Stream Adaptive Convolutions for Parkinson's Disease Recognition

Authors: Md. Zarif Ul Alam, Md Saiful Islam, Ehsan Hoque, M Saifur Rahman

Abstract: Parkinson's disease (PD) is a neuro-degenerative disorder that affects movement, speech, and coordination. Timely diagnosis and treatment can improve the quality of life for PD patients. However, access to clinical diagnosis is limited in low and middle income countries (LMICs). Therefore, development of automated screening tools for PD can have a huge social impact, particularly in the public health sector. In this paper, we present PULSAR, a novel method to screen for PD from webcam-recorded videos of the finger-tapping task from the Movement Disorder Society - Unified Parkinson's Disease Rating Scale (MDS-UPDRS). PULSAR is trained and evaluated on data collected from 382 participants (183 self-reported as PD patients). We used an adaptive graph convolutional neural network to dynamically learn the spatio-temporal graph edges specific to the finger-tapping task. We enhanced this idea with a multi-stream adaptive convolution model to learn features from different modalities of data critical to detecting PD, such as relative location of the finger joints, velocity and acceleration of tapping. As the labels of the videos are self-reported, there could be cases of undiagnosed PD in the non-PD labeled samples. We leveraged the idea of Positive Unlabeled (PU) Learning that does not need labeled negative data. Our experiments show clear benefit of modeling the problem in this way. PULSAR achieved 80.95% accuracy on the validation set and a mean accuracy of 71.29% (2.49% standard deviation) on an independent test set, despite being trained with a limited amount of data. This is especially promising as labeled data is scarce in the health care sector. We hope PULSAR will make PD screening more accessible to everyone. The proposed techniques could be extended for assessment of other movement disorders, such as ataxia and Huntington's disease.

Submitted 16 February, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

arXiv:2312.05407 [pdf, other]

Active Learning Guided Federated Online Adaptation: Applications in Medical Image Segmentation

Authors: Md Shazid Islam, Sayak Nag, Arindam Dutta, Miraj Ahmed, Fahim Faisal Niloy, Amit K. Roy-Chowdhury

Abstract: Data privacy, storage, and distribution shifts are major bottlenecks in medical image analysis. Data cannot be shared across patients, physicians, and facilities due to privacy concerns, usually requiring each patient's data to be analyzed in a discreet setting at a near real-time pace. However, one would like to take advantage of the accumulated knowledge across healthcare facilities as the computational systems analyze data of more and more patients while incorporating feedback provided by physicians to improve accuracy. Motivated by these, we propose a method for medical image segmentation that adapts to each incoming data batch (online adaptation), incorporates physician feedback through active learning, and assimilates knowledge across facilities in a federated setup. Combining an online adaptation scheme at test time with an efficient sampling strategy with budgeted annotation helps bridge the gap between the source and the incoming stream of target domain data. A federated setup allows collaborative aggregation of knowledge across distinct distributed models without needing to share the data across different models. This facilitates the improvement of performance over time by accumulating knowledge across users. Towards achieving these goals, we propose a computationally amicable, privacy-preserving image segmentation technique DrFRODA that uses federated learning to adapt the model in an online manner with feedback from doctors in the loop. Our experiments on publicly available datasets show that the proposed distributed active learning-based online adaptation method outperforms unsupervised online adaptation methods and shows competitive results with offline active learning-based adaptation methods.

Submitted 8 December, 2023; originally announced December 2023.

arXiv:2311.12654 [pdf, other]

PARK: Parkinson's Analysis with Remote Kinetic-tasks

Authors: Md Saiful Islam, Sangwu Lee, Abdelrahman Abdelkader, Sooyong Park, Ehsan Hoque

Abstract: We present a web-based framework to screen for Parkinson's disease (PD) by allowing users to perform neurological tests in their homes. Our web framework guides the users to complete three tasks involving speech, facial expression, and finger movements. The task videos are analyzed to classify whether the users show signs of PD. We present the results in an easy-to-understand manner, along with personalized resources to further access to treatment and care. Our framework is accessible by any major web browser, improving global access to neurological care.

Submitted 21 November, 2023; originally announced November 2023.

arXiv:2311.00983 [pdf, other]

Optimizing Inventory Routing: A Decision-Focused Learning Approach using Neural Networks

Authors: MD Shafikul Islam, Azmine Toushik Wasi

Abstract: Inventory Routing Problem (IRP) is a crucial challenge in supply chain management as it involves optimizing efficient route selection while considering the uncertainty of inventory demand planning. To solve IRPs, usually a two-stage approach is employed, where demand is predicted using machine learning techniques first, and then an optimization algorithm is used to minimize routing costs. Our experiment shows machine learning models fall short of achieving perfect accuracy because inventory levels are influenced by the dynamic business environment, which, in turn, affects the optimization problem in the next stage, resulting in sub-optimal decisions. In this paper, we formulate and propose a decision-focused learning-based approach to solving real-world IRPs. This approach directly integrates inventory prediction and routing optimization within an end-to-end system potentially ensuring a robust supply chain strategy.

Submitted 2 November, 2023; originally announced November 2023.

Comments: 3 Pages, 2 figures, New in ML Workshop at NeurIPS 2023. Openreview forum: https://openreview.net/forum?id=r0fzjB8f7f&

Journal ref: New in Machine Learning Workshop, NeurIPS 2023

arXiv:2310.19430 [pdf]

Roadmap on Photovoltaic Absorber Materials for Sustainable Energy Conversion

Authors: James C. Blakesley, Ruy S. Bonilla, Marina Freitag, Alex M. Ganose, Nicola Gasparini, Pascal Kaienburg, George Koutsourakis, Jonathan D. Major, Jenny Nelson, Nakita K. Noel, Bart Roose, Jae Sung Yun, Simon Aliwell, Pietro P. Altermatt, Tayebeh Ameri, Virgil Andrei, Ardalan Armin, Diego Bagnis, Jenny Baker, Hamish Beath, Mathieu Bellanger, Philippe Berrouard, Jochen Blumberger, Stuart A. Boden, Hugo Bronstein, et al. (61 additional authors not shown)

Abstract: Photovoltaics (PVs) are a critical technology for curbing growing levels of anthropogenic greenhouse gas emissions, and meeting increases in future demand for low-carbon electricity. In order to fulfil ambitions for net-zero carbon dioxide equivalent (CO2eq) emissions worldwide, the global cumulative capacity of solar PVs must increase by an order of magnitude from 0.9 TWp in 2021 to 8.5 TWp by 2050 according to the International Renewable Energy Agency, which is considered to be a highly conservative estimate. In 2020, the Henry Royce Institute brought together the UK PV community to discuss the critical technological and infrastructure challenges that need to be overcome to address the vast challenges in accelerating PV deployment. Herein, we examine the key developments in the global community, especially the progress made in the field since this earlier roadmap, bringing together experts primarily from the UK across the breadth of the photovoltaics community. The focus is both on the challenges in improving the efficiency, stability and levelized cost of electricity of current technologies for utility-scale PVs, as well as the fundamental questions in novel technologies that can have a significant impact on emerging markets, such as indoor PVs, space PVs, and agrivoltaics. We discuss challenges in advanced metrology and computational tools, as well as the growing synergies between PVs and solar fuels, and offer a perspective on the environmental sustainability of the PV industry. Through this roadmap, we emphasize promising pathways forward in both the short- and long-term, and for communities working on technologies across a range of maturity levels to learn from each other.

Submitted 30 October, 2023; originally announced October 2023.

Comments: 160 pages, 21 figures

arXiv:2310.06848 [pdf, other]

DeepTriNet: A Tri-Level Attention Based DeepLabv3+ Architecture for Semantic Segmentation of Satellite Images

Authors: Tareque Bashar Ovi, Shakil Mosharrof, Nomaiya Bashree, Md Shofiqul Islam, Muhammad Nazrul Islam

Abstract: The segmentation of satellite images is crucial in remote sensing applications. Existing methods face challenges in recognizing small-scale objects in satellite images for semantic segmentation, primarily because they ignore the low-level characteristics of the underlying network and because different feature maps contain distinct amounts of information. Thus, in this research, a tri-level attention-based DeepLabv3+ architecture (DeepTriNet) is proposed for the semantic segmentation of satellite images. The proposed hybrid method combines squeeze-and-excitation networks (SENets) and tri-level attention units (TAUs) with the vanilla DeepLabv3+ architecture, where the TAUs are used to bridge the semantic feature gap among the encoder outputs and the SENets are used to put more weight on relevant features. The proposed DeepTriNet determines which features are more relevant in a more generalized way through self-supervision rather than through manual annotation. The study showed that the proposed DeepTriNet performs better than many conventional techniques, with an accuracy of 98% and 77%, IoU of 80% and 58%, precision of 88% and 68%, and recall of 79% and 55% on the 4-class Land-Cover.ai dataset and the 15-class GID-2 dataset, respectively. The proposed method will greatly contribute to natural resource management and change detection in rural and urban regions through efficient and semantic satellite image segmentation.

Submitted 5 September, 2023; originally announced October 2023.

Comments: Keywords: Attention mechanism, Deep learning, Satellite image, DeepLabv3+, Segmentation

arXiv:2310.05768 [pdf]

doi 10.1109/ICCD59681.2023.10420622

DANet: Enhancing Small Object Detection through an Efficient Deformable Attention Network

Authors: Md Sohag Mia, Abdullah Al Bary Voban, Abu Bakor Hayat Arnob, Abdu Naim, Md Kawsar Ahmed, Md Shariful Islam

Abstract: Efficient and accurate detection of small objects in manufacturing settings, such as defects and cracks, is crucial for ensuring product quality and safety. To address this issue, we proposed a comprehensive strategy by synergizing Faster R-CNN with cutting-edge methods. By combining Faster R-CNN with Feature Pyramid Network, we enable the model to efficiently handle multi-scale features intrinsic to manufacturing environments. Additionally, Deformable Net is used that contorts and conforms to the geometric variations of defects, bringing precision in detecting even the minuscule and complex features. Then, we incorporated an attention mechanism called Convolutional Block Attention Module in each block of our base ResNet50 network to selectively emphasize informative features and suppress less useful ones. After that we incorporated RoI Align, replacing RoI Pooling for finer region-of-interest alignment and finally the integration of Focal Loss effectively handles class imbalance, crucial for rare defect occurrences. The rigorous evaluation of our model on both the NEU-DET and Pascal VOC datasets underscores its robust performance and generalization capabilities. On the NEU-DET dataset, our model exhibited a profound understanding of steel defects, achieving state-of-the-art accuracy in identifying various defects. Simultaneously, when evaluated on the Pascal VOC dataset, our model showcases its ability to detect objects across a wide spectrum of categories within complex and small scenes.

Submitted 13 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

Comments: ICCD-23

Report number: 10.1109/ICCD59681.2023

Journal ref: International Conference on the Cognitive Computing and Complex Data (ICCD) 2023

arXiv:2310.05664 [pdf]

doi 10.1109/ICCD59681.2023.10420683

ViTs are Everywhere: A Comprehensive Study Showcasing Vision Transformers in Different Domain

Authors: Md Sohag Mia, Abu Bakor Hayat Arnob, Abdu Naim, Abdullah Al Bary Voban, Md Shariful Islam

Abstract: Transformer design is the de facto standard for natural language processing tasks. The success of the transformer design in natural language processing has lately piqued the interest of researchers in the domain of computer vision. When compared to Convolutional Neural Networks (CNNs), Vision Transformers (ViTs) are becoming more popular and dominant solutions for many vision problems. Transformer-based models outperform other types of networks, such as convolutional and recurrent neural networks, in a range of visual benchmarks. We evaluate various vision transformer models in this work by dividing them into distinct jobs and examining their benefits and drawbacks. ViTs can overcome several possible difficulties with convolutional neural networks (CNNs). The goal of this survey is to show the first use of ViTs in CV. In the first phase, we categorize various CV applications where ViTs are appropriate. Image classification, object identification, image segmentation, video transformer, image denoising, and NAS are all CV applications. Our next step will be to analyze the state-of-the-art in each area and identify the models that are currently available. In addition, we outline numerous open research difficulties as well as prospective research possibilities.

Submitted 13 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

Comments: ICCD-2023. arXiv admin note: substantial text overlap with arXiv:2208.04309 by other authors

Journal ref: International Conference on the Cognitive Computing and Complex Data (ICCD) 2023

arXiv:2309.16151 [pdf]

Ab-initio insights into the physical properties of XIr3 (X = La, Th) superconductors: A comparative analysis

Authors: Md. Sajidul Islam, Razu Ahmed, M. M. Hossain, M. A. Ali, M. M. Uddin, S. H. Naqib

Abstract: Here we report the structural, elastic, bonding, thermo-mechanical, optoelectronic and superconducting state properties of recently discovered XIr3 (X = La, Th) superconductors utilizing the density functional theory (DFT). The elastic, bonding, thermal and optical properties of these compounds are investigated for the first time. The calculated lattice and superconducting state parameters are in reasonable agreement with those found in the literature. In the ground state, both compounds are mechanically stable and possess highly ductile character, high machinability, low Debye temperature, low bond hardness and significantly high melting point. The thermal conductivities of the compounds are found to be very low, which suggests that they can be used for thermal insulation purposes. The population analysis and charge density distribution map confirm the presence of both ionic and covalent bonds in the compounds, with ionic bonds playing the dominant role. The calculated band structure and DOS profiles indicate metallic character. Unlike the significant anisotropy observed in elastic and thermal properties, all the optical constants of these compounds exhibit almost isotropic behavior. The optical constants correspond very well with the electronic band structure and DOS features. We have estimated the superconducting transition temperature of the compounds in this work.

Submitted 27 September, 2023; originally announced September 2023.

arXiv:2309.15132 [pdf, other]

Genetic InfoMax: Exploring Mutual Information Maximization in High-Dimensional Imaging Genetics Studies

Authors: Yaochen Xie, Ziqian Xie, Sheikh Muhammad Saiful Islam, Degui Zhi, Shuiwang Ji

Abstract: Genome-wide association studies (GWAS) are used to identify relationships between genetic variations and specific traits. When applied to high-dimensional medical imaging data, a key step is to extract lower-dimensional, yet informative representations of the data as traits. Representation learning for imaging genetics is largely under-explored due to the unique challenges posed by GWAS in comparison to typical visual representation learning. In this study, we tackle this problem from the mutual information (MI) perspective by identifying key limitations of existing methods. We introduce a trans-modal learning framework Genetic InfoMax (GIM), including a regularized MI estimator and a novel genetics-informed transformer to address the specific challenges of GWAS. We evaluate GIM on human brain 3D MRI data and establish standardized evaluation protocols to compare it to existing approaches. Our results demonstrate the effectiveness of GIM and a significantly improved performance on GWAS.

Submitted 25 September, 2023; originally announced September 2023.

Comments: 17 pages, 7 figures

arXiv:2309.15023 [pdf]

Large-Scale Statistical Analysis of Defect Emission in hBN: Revealing Spectral Families and Influence of Flakes Morphology

Authors: M. S. Islam, R. K. Chowdhury, M. Barthelemy, L. Moczko, P. Hebraud, S. Berciaud, A. Barsella, F. Fras

Abstract: Quantum emitters in two-dimensional layered hexagonal boron nitride are quickly emerging as a highly promising platform for next-generation quantum technologies. However, precise identification and control of defects are key parameters to achieve the next step in their development. We conducted a comprehensive study by analyzing over 10,000 photoluminescence emission lines, revealing 11 distinct defect families within the 1.6 to 2.2 eV energy range. This challenges hypotheses of a random energy distribution. We also reported averaged defect parameters, including emission linewidths, spatial density, phonon side bands, and the Debye-Waller factors. These findings provide valuable insights to decipher the microscopic origin of emitters in hBN hosts. We also explored the influence of hBN host morphology on defect family formation, demonstrating its crucial impact. By tuning flake size and arrangement we achieve selective control of defect types while maintaining high spatial density. This offers a scalable approach to defect emission control, diverging from costly engineering methods. It highlights the importance of investigating flake morphological control to gain deeper insights into the origins of defects and to expand the spectral tailoring capabilities of defects in hBN.

Submitted 26 September, 2023; originally announced September 2023.

arXiv:2309.13173 [pdf, other]

BenLLMEval: A Comprehensive Evaluation into the Potentials and Pitfalls of Large Language Models on Bengali NLP

Authors: Mohsinul Kabir, Mohammed Saidul Islam, Md Tahmid Rahman Laskar, Mir Tafseer Nayeem, M Saiful Bari, Enamul Hoque

Abstract: Large Language Models (LLMs) have emerged as one of the most important breakthroughs in NLP for their impressive skills in language generation and other language-specific tasks. Though LLMs have been evaluated in various tasks, mostly in English, they have not yet undergone thorough evaluation in under-resourced languages such as Bengali (Bangla). To this end, this paper introduces BenLLM-Eval, which consists of a comprehensive evaluation of LLMs to benchmark their performance in the Bengali language that has modest resources. In this regard, we select various important and diverse Bengali NLP tasks, such as text summarization, question answering, paraphrasing, natural language inference, transliteration, text classification, and sentiment analysis for zero-shot evaluation of popular LLMs, namely, GPT-3.5, LLaMA-2-13b-chat, and Claude-2. Our experimental results demonstrate that while in some Bengali NLP tasks, zero-shot LLMs could achieve performance on par, or even better than current SOTA fine-tuned models; in most tasks, their performance is quite poor (with the performance of open-source LLMs like LLaMA-2-13b-chat being significantly bad) in comparison to the current SOTA results. Therefore, it calls for further efforts to develop a better understanding of LLMs in modest-resourced languages like Bengali.

Submitted 19 March, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

Comments: Accepted by LREC-COLING 2024. The first two authors contributed equally

arXiv:2309.12350 [pdf, other]

Exploring Internet of Things Adoption Challenges in Manufacturing Firms: A Delphi Fuzzy Analytical Hierarchy Process Approach

Authors: Hasan Shahriar, Md. Saiful Islam, Md Abrar Jahin, Istiyaque Ahmed Ridoy, Raihan Rafi Prottoy, Adiba Abid, M. F. Mridha

Abstract: Innovation is crucial for sustainable success in today's fiercely competitive global manufacturing landscape. Bangladesh's manufacturing sector must embrace transformative technologies like the Internet of Things (IoT) to thrive in this environment. This article addresses the vital task of identifying and evaluating barriers to IoT adoption in Bangladesh's manufacturing industry. Through synthesizing expert insights and carefully reviewing contemporary literature, we explore the intricate landscape of IoT adoption challenges. Our methodology combines the Delphi and Fuzzy Analytical Hierarchy Process, systematically analyzing and prioritizing these challenges. This approach harnesses expert knowledge and uses fuzzy logic to handle uncertainties. Our findings highlight key obstacles, with "Lack of top management commitment to new technology" (B10), "High initial implementation costs" (B9), and "Risks in adopting a new business model" (B7) standing out as significant challenges that demand immediate attention. These insights extend beyond academia, offering practical guidance to industry leaders. With the knowledge gained from this study, managers can develop tailored strategies, set informed priorities, and embark on a transformative journey toward leveraging IoT's potential in Bangladesh's industrial sector. This article provides a comprehensive understanding of IoT adoption challenges and equips industry leaders to navigate them effectively. This strategic navigation, in turn, enhances the competitiveness and sustainability of Bangladesh's manufacturing sector in the IoT era.

Submitted 11 December, 2023; v1 submitted 30 August, 2023; originally announced September 2023.

arXiv:2309.11157 [pdf, other]

Learning Deformable 3D Graph Similarity to Track Plant Cells in Unregistered Time Lapse Images

Authors: Md Shazid Islam, Arindam Dutta, Calvin-Khang Ta, Kevin Rodriguez, Christian Michael, Mark Alber, G. Venugopala Reddy, Amit K. Roy-Chowdhury

Abstract: Tracking of plant cells in images obtained by microscope is a challenging problem due to biological phenomena such as the large number of cells, non-uniform growth of different layers of the tightly packed plant cells, and cell division. Moreover, noisy images in deeper layers of the tissue and unavoidable systemic errors inherent in the imaging process further complicate the problem. In this paper, we propose a novel learning-based method that exploits the tightly packed three-dimensional cell structure of plant cells to create a three-dimensional graph in order to perform accurate cell tracking. We further propose novel algorithms for cell division detection and effective three-dimensional registration, which improve upon the state-of-the-art algorithms. We demonstrate the efficacy of our algorithm in terms of tracking accuracy and inference-time on a benchmark dataset.

Submitted 21 September, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

arXiv:2308.03741 [pdf, other]

MAiVAR-T: Multimodal Audio-image and Video Action Recognizer using Transformers

Authors: Muhammad Bilal Shaikh, Douglas Chai, Syed Mohammed Shamsul Islam, Naveed Akhtar

Abstract: In line with the human capacity to perceive the world by simultaneously processing and integrating high-dimensional inputs from multiple modalities like vision and audio, we propose a novel model, MAiVAR-T (Multimodal Audio-Image to Video Action Recognition Transformer). This model employs an intuitive approach for the combination of audio-image and video modalities, with a primary aim to escalate the effectiveness of multimodal human action recognition (MHAR). At the core of MAiVAR-T lies the significance of distilling substantial representations from the audio modality and transmuting these into the image domain. Subsequently, this audio-image depiction is fused with the video modality to formulate a unified representation. This concerted approach strives to exploit the contextual richness inherent in both audio and video modalities, thereby promoting action recognition. In contrast to existing state-of-the-art strategies that focus solely on audio or video modalities, MAiVAR-T demonstrates superior performance. Our extensive empirical evaluations conducted on a benchmark action recognition dataset corroborate the model's remarkable performance. This underscores the potential enhancements derived from integrating audio and video modalities for action recognition purposes.

Submitted 1 August, 2023; originally announced August 2023.

Comments: 6 pages, 7 figures, 4 tables. Peer reviewed; accepted at the 11th European Workshop on Visual Information Processing (EUVIP), 11th-14th September 2023, Gjøvik, Norway. arXiv admin note: text overlap with arXiv:2103.15691 by other authors

arXiv:2308.02588 [pdf, other]

Unmasking Parkinson's Disease with Smile: An AI-enabled Screening Framework

Authors: Tariq Adnan, Md Saiful Islam, Wasifur Rahman, Sangwu Lee, Sutapa Dey Tithi, Kazi Noshin, Imran Sarker, M Saifur Rahman, Ehsan Hoque

Abstract: Parkinson's disease (PD) diagnosis remains challenging due to lacking a reliable biomarker and limited access to clinical care. In this study, we present an analysis of the largest video dataset containing micro-expressions to screen for PD. We collected 3,871 videos from 1,059 unique participants, including 256 self-reported PD patients. The recordings are from diverse sources encompassing participants' homes across multiple countries, a clinic, and a PD care facility in the US. Leveraging facial landmarks and action units, we extracted features relevant to Hypomimia, a prominent symptom of PD characterized by reduced facial expressions. An ensemble of AI models trained on these features achieved an accuracy of 89.7% and an Area Under the Receiver Operating Characteristic (AUROC) of 89.3% while being free from detectable bias across population subgroups based on sex and ethnicity on held-out data. Further analysis reveals that features from the smiling videos alone lead to comparable performance, even on two external test sets the model has never seen during training, suggesting the potential for PD risk assessment from smiling selfie videos.

Submitted 3 August, 2023; originally announced August 2023.

arXiv:2307.12906 [pdf, other]

QAmplifyNet: Pushing the Boundaries of Supply Chain Backorder Prediction Using Interpretable Hybrid Quantum-Classical Neural Network

Authors: Md Abrar Jahin, Md Sakib Hossain Shovon, Md. Saiful Islam, Jungpil Shin, M. F. Mridha, Yuichi Okuyama

Abstract: Supply chain management relies on accurate backorder prediction for optimizing inventory control, reducing costs, and enhancing customer satisfaction. However, traditional machine-learning models struggle with large-scale datasets and complex relationships, hindering real-world data collection. This research introduces a novel methodological framework for supply chain backorder prediction, addressing the challenge of handling large datasets. Our proposed model, QAmplifyNet, employs quantum-inspired techniques within a quantum-classical neural network to predict backorders effectively on short and imbalanced datasets. Experimental evaluations on a benchmark dataset demonstrate QAmplifyNet's superiority over classical models, quantum ensembles, quantum neural networks, and deep reinforcement learning. Its proficiency in handling short, imbalanced datasets makes it an ideal solution for supply chain management. To enhance model interpretability, we use Explainable Artificial Intelligence techniques. Practical implications include improved inventory control, reduced backorders, and enhanced operational efficiency. QAmplifyNet seamlessly integrates into real-world supply chain management systems, enabling proactive decision-making and efficient resource allocation. Future work involves exploring additional quantum-inspired techniques, expanding the dataset, and investigating other supply chain applications.
This research unlocks the potential of quantum computing in supply chain optimization and paves the way for further exploration of quantum-inspired machine learning models in supply chain management. Our framework and QAmplifyNet model offer a breakthrough approach to supply chain backorder prediction, providing superior performance and opening new avenues for leveraging quantum-inspired techniques in supply chain management.

Submitted 15 October, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

arXiv:2307.05220 [pdf]

Comprehensive first-principles insights into the physical properties of intermetallic Zr$_3$Ir: a noncentrosymmetric superconductor

Authors: Razu Ahmed, Md. Sajidul Islam, M. M. Hossain, M. A. Ali, M. M. Uddin, S. H. Naqib

Abstract: We have looked into the structural, mechanical, optoelectronic, superconducting state and thermophysical aspects of intermetallic compound Zr$_3$Ir using the density functional theory (DFT). Many of the physical properties, including direction dependent mechanical properties, Vickers hardness, optical properties, chemical bonding nature, and charge density distributions, are being investigated for the first time. According to this study, Zr$_3$Ir exhibits ductile features, high machinability, significant metallic bonding, a low Vickers hardness with low Debye temperature, and a modest level of elastic anisotropy. The mechanical and dynamical stabilities of Zr$_3$Ir have been confirmed. The metallic nature of Zr$_3$Ir is seen in the electronic band structures with a high electronic energy density of states at the Fermi level. The bonding nature has been explored by the charge density mapping and bond population analysis. The tetragonal Zr$_3$Ir shows a remarkable electronic stability, as confirmed by the presence of a pseudogap in the electronic energy density of states at the Fermi level between the bonding and antibonding states. Optical parameters show very good agreement with the electronic properties. The reflectivity spectra reveal that Zr$_3$Ir is a good reflector in the infrared and near-visible regions. Zr$_3$Ir is an excellent ultra-violet (UV) radiation absorber. High refractive index at visible photon energies indicates that Zr$_3$Ir could be used to improve the visual aspects of electronic displays.
All the optical constants exhibit a moderate degree of anisotropy. Zr$_3$Ir has a moderate melting point, high damage tolerance, and very low minimum thermal conductivity. The thermomechanical characteristics of Zr$_3$Ir reveal that it is a potential thermal barrier coating material. The superconducting state parameters of Zr$_3$Ir are also explored.

Submitted 11 July, 2023; originally announced July 2023.

Showing 1–50 of 212 results for author: Islam, M S