Showing 1–50 of 191 results for author: Khan, Z

Searching in archive cs.
  1. arXiv:2406.10889  [pdf, other]

    cs.CV cs.AI cs.LG

    VELOCITI: Can Video-Language Models Bind Semantic Concepts through Time?

    Authors: Darshana Saravanan, Darshan Singh, Varun Gupta, Zeeshan Khan, Vineet Gandhi, Makarand Tapaswi

    Abstract: Compositionality is a fundamental aspect of vision-language understanding and is especially required for videos since they contain multiple entities (e.g. persons, actions, and scenes) interacting dynamically over time. Existing benchmarks focus primarily on perception capabilities. However, they do not study binding, the ability of a model to associate entities through appropriate relationships.…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 26 pages, 17 figures, 3 tables

  2. arXiv:2405.17788  [pdf, other]

    cs.CV

    Enhancing Road Safety: Real-Time Detection of Driver Distraction through Convolutional Neural Networks

    Authors: Amaan Aijaz Sheikh, Imaad Zaffar Khan

    Abstract: As we navigate our daily commutes, the threat posed by a distracted driver is at large, resulting in a troubling rise in traffic accidents. Addressing this safety concern, our project harnesses the analytical power of Convolutional Neural Networks (CNNs), with a particular emphasis on the well-established models VGG16 and VGG19. These models are acclaimed for their precision in image recognition…

    Submitted 27 May, 2024; originally announced May 2024.

  3. arXiv:2405.13949  [pdf, other]

    cs.CV

    PitVQA: Image-grounded Text Embedding LLM for Visual Question Answering in Pituitary Surgery

    Authors: Runlong He, Mengya Xu, Adrito Das, Danyal Z. Khan, Sophia Bano, Hani J. Marcus, Danail Stoyanov, Matthew J. Clarkson, Mobarakol Islam

    Abstract: Visual Question Answering (VQA) within the surgical domain, utilizing Large Language Models (LLMs), offers a distinct opportunity to improve intra-operative decision-making and facilitate intuitive surgeon-AI interaction. However, the development of LLMs for surgical VQA is hindered by the scarcity of diverse and extensive datasets with complex reasoning tasks. Moreover, contextual fusion of the i…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 10 pages, 3 figures

  4. arXiv:2405.11483  [pdf, other]

    cs.CV

    MICap: A Unified Model for Identity-aware Movie Descriptions

    Authors: Haran Raajesh, Naveen Reddy Desanur, Zeeshan Khan, Makarand Tapaswi

    Abstract: Characters are an important aspect of any storyline and identifying and including them in descriptions is necessary for story understanding. While previous work has largely ignored identity and generated captions with someone (anonymized names), recent work formulates id-aware captioning as a fill-in-the-blanks (FITB) task, where, given a caption with blanks, the goal is to predict person id label…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: CVPR 2024, Project Page: https://katha-ai.github.io/projects/micap/

  5. arXiv:2404.10193  [pdf, other]

    cs.CV

    Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering

    Authors: Zaid Khan, Yun Fu

    Abstract: The goal of selective prediction is to allow a model to abstain when it may not be able to deliver a reliable prediction, which is important in safety-critical contexts. Existing approaches to selective prediction typically require access to the internals of a model, require retraining a model or study only unimodal models. However, the most powerful models (e.g. GPT-4) are typically only avail…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  6. arXiv:2404.04627  [pdf, other]

    cs.CV

    Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement

    Authors: Zaid Khan, Vijay Kumar BG, Samuel Schulter, Yun Fu, Manmohan Chandraker

    Abstract: Visual program synthesis is a promising approach to exploit the reasoning abilities of large language models for compositional computer vision tasks. Previous work has used few-shot prompting with frozen LLMs to synthesize visual programs. Training an LLM to write better visual programs is an attractive prospect, but it is unclear how to accomplish this. No dataset of visual programs for training…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  7. arXiv:2403.09715  [pdf, other]

    cs.SE cs.CL cs.CR cs.LG

    Textual analysis of End User License Agreement for red-flagging potentially malicious software

    Authors: Behraj Khan, Tahir Syed, Zeshan Khan, Muhammad Rafi

    Abstract: New software and updates are downloaded by end users every day. Each downloaded software package has an associated End User License Agreement (EULA), but this is rarely read. A EULA includes information to avoid legal repercussions. However, this poses a host of potential problems such as spyware or producing an unwanted effect in the target system. End users do not read these EULAs because o…

    Submitted 11 March, 2024; originally announced March 2024.

  8. arXiv:2402.05126  [pdf, other]

    cs.CL cs.LG

    Graph Neural Network and NER-Based Text Summarization

    Authors: Imaad Zaffar Khan, Amaan Aijaz Sheikh, Utkarsh Sinha

    Abstract: With the abundance of data and information available today, it is nearly impossible for man, or even machine, to go through all of the data line by line. What one usually does is skim through the lines and retain the absolutely important information, which in more formal terms is called summarization. Text summarization is an important task that aims to compress lengthy documents or arti…

    Submitted 4 February, 2024; originally announced February 2024.

  9. arXiv:2401.12667  [pdf, ps, other]

    stat.ML cs.LG

    Feature Selection via Robust Weighted Score for High Dimensional Binary Class-Imbalanced Gene Expression Data

    Authors: Zardad Khan, Amjad Ali, Saeed Aldahmani

    Abstract: In this paper, a robust weighted score for unbalanced data (ROWSU) is proposed for selecting the most discriminative feature for high dimensional gene expression binary classification with class-imbalance problem. The method addresses one of the most challenging problems of highly skewed class distributions in gene expression datasets that adversely affect the performance of classification algorit…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: 25 pages

    MSC Class: 14J60

  10. arXiv:2401.07669  [pdf, other]

    cs.CV

    FiGCLIP: Fine-Grained CLIP Adaptation via Densely Annotated Videos

    Authors: Darshan Singh S, Zeeshan Khan, Makarand Tapaswi

    Abstract: While contrastive language image pretraining (CLIP) has exhibited impressive performance by learning highly semantic and generalized representations, recent works have exposed a fundamental drawback in its syntactic properties, which include interpreting fine-grained attributes, actions, spatial relations, states, and details that require compositional reasoning. One reason for this is that natur…

    Submitted 15 January, 2024; originally announced January 2024.

  11. arXiv:2311.09762  [pdf, other]

    cs.CL cs.AI cs.LG

    Graph Elicitation for Guiding Multi-Step Reasoning in Large Language Models

    Authors: Jinyoung Park, Ameen Patel, Omar Zia Khan, Hyunwoo J. Kim, Joo-Kyung Kim

    Abstract: Chain-of-Thought (CoT) prompting along with sub-question generation and answering has enhanced multi-step reasoning capabilities of Large Language Models (LLMs). However, prompting the LLMs to directly generate sub-questions is suboptimal since they sometimes generate redundant or irrelevant questions. To deal with them, we propose a GE-Reasoning method, which directs LLMs to generate proper sub-q…

    Submitted 22 June, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Preprint

  12. arXiv:2310.20081  [pdf, other]

    cs.CL cs.AI cs.IR

    Integrating Summarization and Retrieval for Enhanced Personalization via Large Language Models

    Authors: Chris Richardson, Yao Zhang, Kellen Gillespie, Sudipta Kar, Arshdeep Singh, Zeynab Raeesy, Omar Zia Khan, Abhinav Sethy

    Abstract: Personalization, the ability to tailor a system to individual users, is an essential factor in user experience with natural language processing (NLP) systems. With the emergence of Large Language Models (LLMs), a key question is how to leverage these models to better personalize user experiences. To personalize a language model's output, a straightforward approach is to incorporate past user data…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: 4 pages, International Workshop on Personalized Generative AI (@CIKM 2023)

    ACM Class: I.2.7; H.3.3

  13. arXiv:2310.17954  [pdf, other]

    eess.IV cs.CV

    Multivessel Coronary Artery Segmentation and Stenosis Localisation using Ensemble Learning

    Authors: Muhammad Bilal, Dinis Martinho, Reiner Sim, Adnan Qayyum, Hunaid Vohra, Massimo Caputo, Taofeek Akinosho, Sofiat Abioye, Zaheer Khan, Waleed Niaz, Junaid Qadir

    Abstract: Coronary angiography analysis is a common clinical task performed by cardiologists to diagnose coronary artery disease (CAD) through an assessment of atherosclerotic plaque's accumulation. This study introduces an end-to-end machine learning solution developed as part of our solution for the MICCAI 2023 Automatic Region-based Coronary Artery Disease diagnostics using x-ray angiography imagEs (ARCA…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: Submission report for ARCADE challenge hosted at MICCAI2023

  14. arXiv:2310.17050  [pdf, other]

    cs.CV

    Exploring Question Decomposition for Zero-Shot VQA

    Authors: Zaid Khan, Vijay Kumar BG, Samuel Schulter, Manmohan Chandraker, Yun Fu

    Abstract: Visual question answering (VQA) has traditionally been treated as a single-step task where each question receives the same amount of effort, unlike natural human question-answering strategies. We explore a question decomposition strategy for VQA to overcome this limitation. We probe the ability of recently developed large vision-language models to use human-written decompositions and produce their…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023 Camera Ready

  15. arXiv:2310.17032  [pdf, other]

    quant-ph cs.LG

    Quantum Long Short-Term Memory (QLSTM) vs Classical LSTM in Time Series Forecasting: A Comparative Study in Solar Power Forecasting

    Authors: Saad Zafar Khan, Nazeefa Muzammil, Salman Ghafoor, Haibat Khan, Syed Mohammad Hasan Zaidi, Abdulah Jeza Aljohani, Imran Aziz

    Abstract: Accurate solar power forecasting is pivotal for the global transition towards sustainable energy systems. This study conducts a meticulous comparison between Quantum Long Short-Term Memory (QLSTM) and classical Long Short-Term Memory (LSTM) models for solar power production forecasting. The primary objective is to evaluate the potential advantages of QLSTMs, leveraging their exponential representa…

    Submitted 9 April, 2024; v1 submitted 25 October, 2023; originally announced October 2023.

    Comments: 33 pages, 9 figures

  16. arXiv:2308.15827  [pdf, other]

    cs.CV

    Introducing Language Guidance in Prompt-based Continual Learning

    Authors: Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc Van Gool, Didier Stricker, Federico Tombari, Muhammad Zeshan Afzal

    Abstract: Continual Learning aims to learn a single model on a sequence of tasks without having access to data from previous tasks. The biggest challenge in the domain still remains catastrophic forgetting: a loss in performance on seen classes of earlier tasks. Some existing methods rely on an expensive replay buffer to store a chunk of data from previous tasks. This, while promising, becomes expensive whe…

    Submitted 30 August, 2023; originally announced August 2023.

    Comments: Accepted at ICCV 2023

  17. arXiv:2307.16262  [pdf, other]

    eess.IV cs.CV

    Validating polyp and instrument segmentation methods in colonoscopy through Medico 2020 and MedAI 2021 Challenges

    Authors: Debesh Jha, Vanshali Sharma, Debapriya Banik, Debayan Bhattacharya, Kaushiki Roy, Steven A. Hicks, Nikhil Kumar Tomar, Vajira Thambawita, Adrian Krenzer, Ge-Peng Ji, Sahadev Poudel, George Batchkala, Saruar Alam, Awadelrahman M. A. Ahmed, Quoc-Huy Trinh, Zeshan Khan, Tien-Phat Nguyen, Shruti Shrestha, Sabari Nathan, Jeonghwan Gwak, Ritika K. Jha, Zheyuan Zhang, Alexander Schlaefer, Debotosh Bhattacharjee, M. K. Bhuyan, et al. (8 additional authors not shown)

    Abstract: Automatic analysis of colonoscopy images has been an active field of research motivated by the importance of early detection of precancerous polyps. However, detecting polyps during the live examination can be challenging due to various factors such as variation of skills and experience among the endoscopists, lack of attentiveness, and fatigue leading to a high polyp miss-rate. Deep learning has…

    Submitted 6 May, 2024; v1 submitted 30 July, 2023; originally announced July 2023.

  18. arXiv:2306.03932  [pdf, other]

    cs.CV

    Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!

    Authors: Zaid Khan, Vijay Kumar BG, Samuel Schulter, Xiang Yu, Yun Fu, Manmohan Chandraker

    Abstract: Finetuning a large vision language model (VLM) on a target dataset after large scale pretraining is a dominant paradigm in visual question answering (VQA). Datasets for specialized tasks such as knowledge-based VQA or VQA in non natural-image domains are orders of magnitude smaller than those for general-purpose VQA. While collecting additional labels for specialized tasks or domains can be challe…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: CVPR 2023

  19. arXiv:2306.01819  [pdf]

    cs.PL cs.AI

    Comparative Analysis of Widely use Object-Oriented Languages

    Authors: Muhammad Shoaib Farooq, Taymour zaman Khan

    Abstract: Programming is an integral part of the computer science discipline. Every day the programming environment is not only rapidly growing but also changing, and languages are constantly evolving. Learning the object-oriented paradigm is compulsory in every computer science major, so the choice of language to teach object-oriented principles is very important. Due to the large pool of object-oriented languages, i…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: 30 pages, 2 figures

  20. arXiv:2305.15897  [pdf, other]

    cs.SE

    Impact of Log Parsing on Log-based Anomaly Detection

    Authors: Zanis Ali Khan, Donghwan Shin, Domenico Bianculli, Lionel Briand

    Abstract: Software systems log massive amounts of data, recording important runtime information. Such logs are used, for example, for log-based anomaly detection, which aims to automatically detect abnormal behaviors of the system under analysis by processing the information recorded in its logs. Many log-based anomaly detection techniques based on deep-learning models include a pre-processing step called l…

    Submitted 25 May, 2023; originally announced May 2023.

  21. arXiv:2305.06934  [pdf, other]

    cs.SE cs.AI cs.CL cs.CY cs.LG cs.PL

    Humans are Still Better than ChatGPT: Case of the IEEEXtreme Competition

    Authors: Anis Koubaa, Basit Qureshi, Adel Ammar, Zahid Khan, Wadii Boulila, Lahouari Ghouti

    Abstract: Since the release of ChatGPT, numerous studies have highlighted the remarkable performance of ChatGPT, which often rivals or even surpasses human capabilities in various tasks and domains. However, this paper presents a contrasting perspective by demonstrating an instance where human performance excels in typical tasks suited for ChatGPT, specifically in the domain of computer programming. We util…

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: 9 pages, 3 figures

  22. arXiv:2304.09756  [pdf, other]

    cs.LG cs.AI cs.HC eess.SP

    Contactless Human Activity Recognition using Deep Learning with Flexible and Scalable Software Define Radio

    Authors: Muhammad Zakir Khan, Jawad Ahmad, Wadii Boulila, Matthew Broadbent, Syed Aziz Shah, Anis Koubaa, Qammer H. Abbasi

    Abstract: Ambient computing is gaining popularity as a major technological advancement for the future. The modern era has witnessed a surge in the advancement in healthcare systems, with viable radio frequency solutions proposed for remote and unobtrusive human activity recognition (HAR). Specifically, this study investigates the use of Wi-Fi channel state information (CSI) as a novel method of ambient sens…

    Submitted 18 April, 2023; originally announced April 2023.

  23. arXiv:2304.04161  [pdf]

    eess.IV cs.CV

    Detection of COVID19 in Chest X-Ray Images Using Transfer Learning

    Authors: Zanoby N. Khan

    Abstract: COVID19 is a highly contagious disease that has infected millions of people worldwide. With limited testing components, screening tools such as chest radiography can assist clinicians in diagnosing the disease and assessing its progress. The performance of deep learning-based systems for diagnosis of COVID-19 disease in radiograph images has been encouraging. This paper investigates the concept of tr…

    Submitted 9 April, 2023; originally announced April 2023.

  24. arXiv:2304.03561  [pdf]

    cs.IT

    Diversity Preserving, Universal Hard Decision Decoder for Linear Block Codes

    Authors: Praveen Sai Bere, Mohammed Zafar Ali Khan

    Abstract: Hard-decision decoding does not preserve the diversity order. This results in severe performance degradation in fading channels. In contrast, soft-decision decoding preserves the diversity order at an impractical computational complexity. For a linear block code $\mathscr{C}(n,k)$ of length $n$ and dimension $k$, the complexity of soft-decision decoding is of the order of $2^k$. This paper pro…

    Submitted 7 April, 2023; originally announced April 2023.

    Comments: Transaction of 10 pages with 4 figures

  25. arXiv:2303.12210  [pdf, ps, other]

    stat.ML cs.LG

    A Random Projection k Nearest Neighbours Ensemble for Classification via Extended Neighbourhood Rule

    Authors: Amjad Ali, Muhammad Hamraz, Dost Muhammad Khan, Wajdan Deebani, Zardad Khan

    Abstract: Ensembles based on k nearest neighbours (kNN) combine a large number of base learners, each constructed on a sample taken from a given training data. Typical kNN based ensembles determine the k closest observations in the training data bounded to a test sample point by a spherical region to predict its class. In this paper, a novel random projection extended neighbourhood rule (RPExNRule) ensemble…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: 23 pages, 8 diagrams, 69 references

    ACM Class: F.2.2

  26. arXiv:2303.11866  [pdf, other]

    cs.CV

    Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning

    Authors: Zaid Khan, Yun Fu

    Abstract: Contrastive vision-language models (e.g. CLIP) are typically created by updating all the parameters of a vision model and language model through contrastive training. Can such models be created by a small number of parameter updates to an already-trained language model and vision model? The literature describes techniques that can create vision-language models by updating a small number of paramet…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: Accepted to ICLR 2023

  27. TAU: A Framework for Video-Based Traffic Analytics Leveraging Artificial Intelligence and Unmanned Aerial Systems

    Authors: Bilel Benjdira, Anis Koubaa, Ahmad Taher Azar, Zahid Khan, Adel Ammar, Wadii Boulila

    Abstract: Smart traffic engineering and intelligent transportation services are in increasing demand from governmental authorities to optimize traffic performance and thus reduce energy costs, increase the drivers' safety and comfort, ensure traffic laws enforcement, and detect traffic violations. In this paper, we address this challenge, and we leverage the use of Artificial Intelligence (AI) and Unmanned…

    Submitted 1 March, 2023; originally announced March 2023.

    Comments: This is the final proofread version submitted to Elsevier EAAI: please see the published version at: https://doi.org/10.1016/j.engappai.2022.105095

    Journal ref: Engineering Applications of Artificial Intelligence, Volume 114, 2022, 105095, ISSN 0952-1976

  28. arXiv:2302.10978  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Learning to Retrieve Engaging Follow-Up Queries

    Authors: Christopher Richardson, Sudipta Kar, Anjishnu Kumar, Anand Ramachandran, Omar Zia Khan, Zeynab Raeesy, Abhinav Sethy

    Abstract: Open domain conversational agents can answer a broad range of targeted queries. However, the sequential nature of interaction with these systems makes knowledge exploration a lengthy task which burdens the user with asking a chain of well phrased questions. In this paper, we present a retrieval based system and associated dataset for predicting the next questions that the user might have. Such a s…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: EACL 2023

  29. arXiv:2212.02291  [pdf, other]

    cs.CV

    I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification

    Authors: Muhammad Ferjad Naeem, Muhammad Gul Zain Ali Khan, Yongqin Xian, Muhammad Zeshan Afzal, Didier Stricker, Luc Van Gool, Federico Tombari

    Abstract: Recent works have shown that unstructured text (documents) from online sources can serve as useful auxiliary information for zero-shot image classification. However, these methods require access to a high-quality source like Wikipedia and are limited to a single source of information. Large Language Models (LLM) trained on web-scale text show impressive abilities to repurpose their learned knowled…

    Submitted 5 December, 2022; originally announced December 2022.

  30. arXiv:2211.11278  [pdf, ps, other]

    stat.ML cs.LG

    Optimal Extended Neighbourhood Rule $k$ Nearest Neighbours Ensemble

    Authors: Amjad Ali, Zardad Khan, Dost Muhammad Khan, Saeed Aldahmani

    Abstract: The traditional k nearest neighbor (kNN) approach uses a distance formula within a spherical region to determine the k closest training observations to a test sample point. However, this approach may not work well when the test point is located outside this region. Moreover, aggregating many base kNN learners can result in poor ensemble performance due to high classification errors. To address these i…

    Submitted 15 February, 2024; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: This manuscript has been submitted for publication in the esteemed journal Pattern Recognition Letters

    MSC Class: 14J60

  31. arXiv:2210.11557  [pdf, other]

    cs.CV

    Learning Attention Propagation for Compositional Zero-Shot Learning

    Authors: Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc Van Gool, Alain Pagani, Didier Stricker, Muhammad Zeshan Afzal

    Abstract: Compositional zero-shot learning aims to recognize unseen compositions of seen visual primitives of object classes and their states. While all primitives (states and objects) are observable during training in some combination, their complex interaction makes this task especially hard. For example, wet changes the visual appearance of a dog very differently from a bicycle. Furthermore, we argue tha…

    Submitted 20 October, 2022; originally announced October 2022.

  32. arXiv:2210.10828  [pdf, other]

    cs.CV

    Grounded Video Situation Recognition

    Authors: Zeeshan Khan, C. V. Jawahar, Makarand Tapaswi

    Abstract: Dense video understanding requires answering several questions such as who is doing what to whom, with what, how, why, and where. Recently, Video Situation Recognition (VidSitu) is framed as a task for structured prediction of multiple events, their relationships, and actions and various verb-role pairs attached to descriptive entities. This task poses several challenges in identifying, disambigua…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: Accepted to NeurIPS 2022. Project Page: https://zeeshank95.github.io/grvidsitu

  33. arXiv:2210.04429  [pdf, other]

    eess.IV cs.CV

    DeepHS-HDRVideo: Deep High Speed High Dynamic Range Video Reconstruction

    Authors: Zeeshan Khan, Parth Shettiwar, Mukul Khanna, Shanmuganathan Raman

    Abstract: Due to hardware constraints, standard off-the-shelf digital cameras suffer from low dynamic range (LDR) and low frame per second (FPS) outputs. Previous works in high dynamic range (HDR) video reconstruction use a sequence of alternating exposure LDR frames as input, and align the neighbouring frames using optical flow based networks. However, these methods often result in motion artifacts in chal…

    Submitted 10 October, 2022; originally announced October 2022.

    Comments: ICPR 2022

  34. Gastrointestinal Disorder Detection with a Transformer Based Approach

    Authors: A. K. M. Salman Hosain, Mynul islam, Md Humaion Kabir Mehedi, Irteza Enan Kabir, Zarin Tasnim Khan

    Abstract: Accurate disease categorization using endoscopic images is a significant problem in Gastroenterology. This paper describes a technique for assisting medical diagnosis procedures and identifying gastrointestinal tract disorders based on the categorization of characteristics taken from endoscopic pictures using a vision transformer and transfer learning model. Vision transformer has shown very promi…

    Submitted 6 October, 2022; originally announced October 2022.

  35. arXiv:2206.12481  [pdf, other]

    cs.LG

    Analyzing Explainer Robustness via Probabilistic Lipschitzness of Prediction Functions

    Authors: Zulqarnain Khan, Davin Hill, Aria Masoomi, Joshua Bone, Jennifer Dy

    Abstract: Machine learning methods have significantly improved in their predictive capabilities, but at the same time they are becoming more complex and less transparent. As a result, explainers are often relied on to provide interpretability to these black-box prediction models. As crucial diagnostics tools, it is important that these explainers themselves are robust. In this paper we focus on one particul…

    Submitted 16 April, 2024; v1 submitted 24 June, 2022; originally announced June 2022.

  36. arXiv:2205.15111  [pdf, ps, other]

    cs.LG

    A k nearest neighbours classifiers ensemble based on extended neighbourhood rule and features subsets

    Authors: Amjad Ali, Muhammad Hamraz, Naz Gul, Dost Muhammad Khan, Zardad Khan, Saeed Aldahmani

    Abstract: kNN based ensemble methods minimise the effect of outliers by identifying a set of data points in the given feature space that are nearest to an unseen observation in order to predict its response by using majority voting. The ordinary ensembles based on kNN find out the k nearest observations in a region (bounded by a sphere) based on a predefined value of k. This scenario, however, might not wor…

    Submitted 30 May, 2022; originally announced May 2022.

    Comments: This paper is submitted to Pattern Recognition and has 26 pages, 9 figures and 5 tables

  37. arXiv:2205.01225  [pdf]

    cs.CR cs.CV

    A Hybrid Defense Method against Adversarial Attacks on Traffic Sign Classifiers in Autonomous Vehicles

    Authors: Zadid Khan, Mashrur Chowdhury, Sakib Mahmud Khan

    Abstract: Adversarial attacks can make deep neural network (DNN) models predict incorrect output labels, such as misclassified traffic signs, for autonomous vehicle (AV) perception modules. Resilience against adversarial attacks can help AVs navigate safely on the road by avoiding misclassification of signs or objects. This DNN-based study develops a resilient traffic sign classifier for AVs that uses a hybri…

    Submitted 24 April, 2022; originally announced May 2022.

    Comments: 13 pages, 8 figures

  38. arXiv:2203.14395  [pdf, other]

    cs.CV

    Single-Stream Multi-Level Alignment for Vision-Language Pretraining

    Authors: Zaid Khan, Vijay Kumar BG, Xiang Yu, Samuel Schulter, Manmohan Chandraker, Yun Fu

    Abstract: Self-supervised vision-language pretraining from pure images and text with a contrastive loss is effective, but ignores fine-grained alignment due to a dual-stream architecture that aligns image and text representations only on a global level. Earlier, supervised, non-contrastive methods were capable of finer-grained alignment, but required dense annotations that were not scalable. We propose a si…

    Submitted 27 July, 2022; v1 submitted 27 March, 2022; originally announced March 2022.

    Comments: ECCV 2022

  39. POSTER: Diagnosis of COVID-19 through Transfer Learning Techniques on CT Scans: A Comparison of Deep Learning Models

    Authors: Aeyan Ashraf, Asad Malik, Zahid Khan

    Abstract: The novel coronavirus disease (COVID-19) constitutes a public health emergency globally. It is a deadly disease which has infected more than 230 million people worldwide. Therefore, early and unswerving detection of COVID-19 is necessary. Evidence of this virus is most commonly being tested by RT-PCR test. This test is not 100% reliable as it is known to give false positives and false negatives. O…

    Submitted 17 March, 2022; originally announced March 2022.

    Journal ref: 2022 2nd International Conference of Smart Systems and Emerging Technologies (SMARTTECH)

  40. A Neural Network based Framework for Effective Laparoscopic Video Quality Assessment

    Authors: Zohaib Amjad Khan, Azeddine Beghdadi, Mounir Kaaniche, Faouzi Alaya Cheikh, Osama Gharbi

    Abstract: Video quality assessment is a challenging problem having a critical significance in the context of medical imaging. For instance, in laparoscopic surgery, the acquired video data suffers from different kinds of distortion that not only hinder surgery performance but also affect the execution of subsequent tasks in surgical navigation and robotic surgeries. For this reason, we propose in this paper…

    Submitted 14 April, 2022; v1 submitted 9 February, 2022; originally announced February 2022.

  41. arXiv:2201.06180  [pdf, other]

    eess.SY cs.AI math.OC

    Nonlinear Control Allocation: A Learning Based Approach

    Authors: Hafiz Zeeshan Iqbal Khan, Surrayya Mobeen, Jahanzeb Rajput, Jamshed Riaz

    Abstract: Modern aircraft are designed with redundant control effectors to cater for fault tolerance and maneuverability requirements. This leads to aircraft being over-actuated and requires control allocation schemes to distribute the control commands among control effectors. Traditionally, optimization-based control allocation schemes are used; however, for nonlinear allocation problems, these methods req…

    Submitted 27 March, 2024; v1 submitted 16 January, 2022; originally announced January 2022.

    Comments: submitted to IEEE Conference on Decision and Control (CDC), 2024

  42. arXiv:2112.09296  [pdf, other]

    cs.CR

    A Survey on the Applications of Blockchains in Security of IoT Systems

    Authors: Zulfiqar Ali Khan, Akbar Siami Namin

    Abstract: The Internet of Things (IoT) has already changed our daily lives by integrating smart devices to deliver high-quality services to its clients. These devices, when integrated, form a network through which massive amounts of data can be produced, transferred, and shared. A critical concern is the security and integrity of such a complex platform to ensure the sustainability an…

    Submitted 16 December, 2021; originally announced December 2021.

    Comments: 15 pages, IEEE Bigdata 2021

  43. New Lower Bounds on the Capacity of Optical Fiber Channels via Optimized Shaping and Detection

    Authors: Marco Secondini, Stella Civelli, Enrico Forestieri, Lareb Zar Khan

    Abstract: Constellation shaping is a practical and effective technique to improve the performance and the rate adaptivity of optical communication systems. In principle, it could also be used to mitigate the impact of nonlinear effects, possibly increasing the information rate beyond the current limit dictated by fiber nonlinearity. However, this appealing idea is frustrated by the difficulty of designing a…

    Submitted 7 December, 2021; originally announced December 2021.

    Comments: Submitted to IEEE Journal of Lightwave Technology on November 30th, 2021

  44. arXiv:2110.07467  [pdf]

    cs.LG cs.AI cs.ET

    Hybrid Quantum-Classical Neural Network for Cloud-supported In-Vehicle Cyberattack Detection

    Authors: Mhafuzul Islam, Mashrur Chowdhury, Zadid Khan, Sakib Mahmud Khan

    Abstract: A classical computer works with ones and zeros, whereas a quantum computer uses ones, zeros, and superpositions of ones and zeros, which enables quantum computers to perform a vast number of calculations simultaneously compared to classical computers. In a cloud-supported cyber-physical system environment, running a machine learning application in quantum computers is often difficult, due to the e…

    Submitted 14 October, 2021; originally announced October 2021.

    Comments: 4 pages, 3 figures

  45. Airfoil's Aerodynamic Coefficients Prediction using Artificial Neural Network

    Authors: Hassan Moin, Hafiz Zeeshan Iqbal Khan, Surrayya Mobeen, Jamshed Riaz

    Abstract: Figuring out the right airfoil is a crucial step in the preliminary stage of any aerial vehicle design, as its shape directly affects the overall aerodynamic characteristics of the aircraft or rotorcraft. Besides being a measure of performance, the aerodynamic coefficients are used to design additional subsystems such as a flight control system, or predict complex dynamic phenomena such as aeroela…

    Submitted 24 September, 2021; originally announced September 2021.

    Journal ref: 2022 19th International Bhurban Conference on Applied Sciences and Technology (IBCAST)

  46. arXiv:2108.05781  [pdf, other]

    cs.NI

    Networked Twins and Twins of Networks: an Overview on the Relationship Between Digital Twins and 6G

    Authors: Hamed Ahmadi, Avishek Nag, Zaheer Khan, Kamran Sayrafian, Susanto Rahadrja

    Abstract: Digital Twin (DT) is a promising technology for the new immersive digital life with a variety of applications in areas such as Industry 4.0, aviation, and healthcare. Proliferation of this technology requires higher data rates, reliability, resilience, and lower latency beyond what is currently offered by 5G. Thus, DT can become a major driver for 6G research and development. Alternatively, 6G net…

    Submitted 12 August, 2021; originally announced August 2021.

    Comments: Accepted for publication at IEEE Communications Standards Magazine

  47. Exploiting BERT For Multimodal Target Sentiment Classification Through Input Space Translation

    Authors: Zaid Khan, Yun Fu

    Abstract: Multimodal target/aspect sentiment classification combines multimodal sentiment analysis and aspect/target sentiment classification. The goal of the task is to combine vision and language to understand the sentiment towards a target entity in a sentence. Twitter is an ideal setting for the task because it is inherently multimodal, highly emotional, and affects real world events. However, multimoda…

    Submitted 5 August, 2021; v1 submitted 3 August, 2021; originally announced August 2021.

    Comments: ACM Multimedia 2021 Oral

  48. arXiv:2108.01127  [pdf]

    cs.LG eess.SY quant-ph

    Hybrid Quantum-Classical Neural Network for Incident Detection

    Authors: Zadid Khan, Sakib Mahmud Khan, Jean Michel Tine, Ayse Turhan Comert, Diamon Rice, Gurcan Comert, Dimitra Michalaka, Judith Mwakalonge, Reek Majumdar, Mashrur Chowdhury

    Abstract: The efficiency and reliability of real-time incident detection models directly impact the affected corridors' traffic safety and operational conditions. The recent emergence of cloud-based quantum computing infrastructure and innovations in noisy intermediate-scale quantum devices have revealed a new era of quantum-enhanced algorithms that can be leveraged to improve real-time incident detection a…

    Submitted 2 August, 2021; originally announced August 2021.

    Comments: 14 pages, 10 figures

  49. arXiv:2108.01125  [pdf]

    quant-ph cs.CR cs.LG

    Hybrid Classical-Quantum Deep Learning Models for Autonomous Vehicle Traffic Image Classification Under Adversarial Attack

    Authors: Reek Majumder, Sakib Mahmud Khan, Fahim Ahmed, Zadid Khan, Frank Ngeni, Gurcan Comert, Judith Mwakalonge, Dimitra Michalaka, Mashrur Chowdhury

    Abstract: Image classification must work for autonomous vehicles (AV) operating on public roads, and actions performed based on image misclassification can have serious consequences. Traffic sign images can be misclassified by an adversarial attack on machine learning models used by AVs for traffic sign recognition. To make classification models resilient against adversarial attacks, we used a hybrid deep-l…

    Submitted 2 August, 2021; originally announced August 2021.

    Comments: 16 pages, 7 figures

  50. arXiv:2107.09622  [pdf, other]

    cs.CL

    More Parameters? No Thanks!

    Authors: Zeeshan Khan, Kartheek Akella, Vinay P. Namboodiri, C V Jawahar

    Abstract: This work studies the long-standing problems of model capacity and negative interference in multilingual neural machine translation (MNMT). We use network pruning techniques and observe that pruning 50-70% of the parameters from a trained MNMT model results in only a 0.29-1.98 drop in the BLEU score, suggesting that there exist large redundancies even in MNMT models. These observations motivate us t…

    Submitted 20 July, 2021; originally announced July 2021.