-
A foundation model utilizing chest CT volumes and radiology reports for supervised-level zero-shot detection of abnormalities
Authors:
Ibrahim Ethem Hamamci,
Sezgin Er,
Furkan Almas,
Ayse Gulnihan Simsek,
Sevval Nil Esirgun,
Irem Dogan,
Muhammed Furkan Dasdelen,
Bastian Wittmann,
Enis Simsar,
Mehmet Simsar,
Emine Bensu Erdemir,
Abdullah Alanbay,
Anjany Sekuboyina,
Berkan Lafci,
Mehmet K. Ozdemir,
Bjoern Menze
Abstract:
A major challenge in computational research in 3D medical imaging is the lack of comprehensive datasets. Addressing this issue, our study introduces CT-RATE, the first 3D medical imaging dataset that pairs images with textual reports. CT-RATE consists of 25,692 non-contrast chest CT volumes, expanded to 50,188 through various reconstructions, from 21,304 unique patients, along with corresponding radiology text reports. Leveraging CT-RATE, we developed CT-CLIP, a CT-focused contrastive language-image pre-training framework. As a versatile, self-supervised model, CT-CLIP is designed for broad application and does not require task-specific training. Remarkably, CT-CLIP outperforms state-of-the-art, fully supervised methods in multi-abnormality detection across all key metrics, thus eliminating the need for manual annotation. We also demonstrate its utility in case retrieval, whether using imagery or textual queries, thereby advancing knowledge dissemination. The open-source release of CT-RATE and CT-CLIP marks a significant advancement in medical AI, enhancing 3D imaging analysis and fostering innovation in healthcare.
Submitted 26 March, 2024;
originally announced March 2024.
-
Estimating and Incentivizing Imperfect-Knowledge Agents with Hidden Rewards
Authors:
Ilgin Dogan,
Zuo-Jun Max Shen,
Anil Aswani
Abstract:
In practice, incentive providers (i.e., principals) often cannot observe the reward realizations of incentivized agents, which is in contrast to many principal-agent models that have been previously studied. This information asymmetry challenges the principal to consistently estimate the agent's unknown rewards by solely watching the agent's decisions, which becomes even more challenging when the agent has to learn its own rewards. This complex setting is observed in various real-life scenarios ranging from renewable energy storage contracts to personalized healthcare incentives. Hence, it offers not only interesting theoretical questions but also wide practical relevance. This paper explores a repeated adverse selection game between a self-interested learning agent and a learning principal. The agent tackles a multi-armed bandit (MAB) problem to maximize their expected reward plus incentive. On top of the agent's learning, the principal trains a parallel algorithm and faces a trade-off between consistently estimating the agent's unknown rewards and maximizing their own utility by offering adaptive incentives to lead the agent. For a non-parametric model, we introduce an estimator whose only input is the history of the principal's incentives and the agent's choices. We unite this estimator with a proposed data-driven incentive policy within a MAB framework. Without restricting the type of the agent's algorithm, we prove finite-sample consistency of the estimator and a rigorous regret bound for the principal by considering the sequential externality imposed by the agent. Lastly, our theoretical results are reinforced by simulations justifying the applicability of our framework to green energy aggregator contracts.
Submitted 13 August, 2023;
originally announced August 2023.
-
GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes
Authors:
Ibrahim Ethem Hamamci,
Sezgin Er,
Anjany Sekuboyina,
Enis Simsar,
Alperen Tezcan,
Ayse Gulnihan Simsek,
Sevval Nil Esirgun,
Furkan Almas,
Irem Dogan,
Muhammed Furkan Dasdelen,
Chinmay Prabhakar,
Hadrien Reynaud,
Sarthak Pati,
Christian Bluethgen,
Mehmet Kemal Ozdemir,
Bjoern Menze
Abstract:
GenerateCT, the first approach to generating 3D medical imaging conditioned on free-form medical text prompts, incorporates a text encoder and three key components: a novel causal vision transformer for encoding 3D CT volumes, a text-image transformer for aligning CT and text tokens, and a text-conditional super-resolution diffusion model. Without directly comparable methods in 3D medical imaging, we benchmarked GenerateCT against cutting-edge methods, demonstrating its superiority across all key metrics. Importantly, we evaluated GenerateCT's clinical applications in a multi-abnormality classification task. First, we established a baseline by training a multi-abnormality classifier on our real dataset. To further assess the model's generalization to external data and performance with unseen prompts in a zero-shot scenario, we employed an external set to train the classifier, setting an additional benchmark. We conducted two experiments in which we doubled the training datasets by synthesizing an equal number of volumes for each set using GenerateCT. The first experiment demonstrated an 11% improvement in the AP score when training the classifier jointly on real and generated volumes. The second experiment showed a 7% improvement when training on both real and generated volumes based on unseen prompts. Moreover, GenerateCT enables the scaling of synthetic training datasets to arbitrary sizes. As an example, we generated 100,000 3D CTs, fivefold the number in our real set, and trained the classifier exclusively on these synthetic CTs. Impressively, this classifier surpassed the performance of the one trained on all available real data by a margin of 8%. Last, domain experts evaluated the generated volumes, confirming a high degree of alignment with the text prompt. Access our code, model weights, training data, and generated data at https://github.com/ibrahimethemhamamci/GenerateCT
Submitted 12 July, 2024; v1 submitted 25 May, 2023;
originally announced May 2023.
-
Repeated Principal-Agent Games with Unobserved Agent Rewards and Perfect-Knowledge Agents
Authors:
Ilgin Dogan,
Zuo-Jun Max Shen,
Anil Aswani
Abstract:
Motivated by a number of real-world applications from domains like healthcare and sustainable transportation, in this paper we study a scenario of repeated principal-agent games within a multi-armed bandit (MAB) framework, where: the principal gives a different incentive for each bandit arm, the agent picks a bandit arm to maximize its own expected reward plus incentive, and the principal observes which arm is chosen and receives a reward (different from that of the agent) for the chosen arm. Designing policies for the principal is challenging because the principal cannot directly observe the reward that the agent receives for its chosen actions, and so the principal cannot directly learn the expected reward using existing estimation techniques. As a result, the problem of designing policies for this scenario, as well as similar ones, remains mostly unexplored. In this paper, we construct a policy that achieves a low regret (i.e., square-root regret up to a log factor) in this scenario for the case where the agent has perfect knowledge of its own expected rewards for each bandit arm. We design our policy by first constructing an estimator for the agent's expected reward for each bandit arm. Since our estimator uses as data the sequence of incentives offered and subsequently chosen arms, the principal's estimation can be regarded as an analogy of online inverse optimization in MABs. Next, we construct a policy that we prove achieves a low regret by deriving finite-sample concentration bounds for our estimator. We conclude with numerical simulations demonstrating the applicability of our policy to a real-life setting from collaborative transportation planning.
Submitted 7 May, 2023; v1 submitted 14 April, 2023;
originally announced April 2023.
-
Assigning Species Information to Corresponding Genes by a Sequence Labeling Framework
Authors:
Ling Luo,
Chih-Hsuan Wei,
Po-Ting Lai,
Qingyu Chen,
Rezarta Islamaj Doğan,
Zhiyong Lu
Abstract:
The automatic assignment of species information to the corresponding genes in a research article is a critically important step in the gene normalization task, whereby a gene mention is normalized and linked to a database record or identifier by a text-mining algorithm. Existing methods typically rely on heuristic rules based on gene and species co-occurrence in the article, but their accuracy is suboptimal. We therefore developed a high-performance method, using a novel deep learning-based framework, to classify whether there is a relation between a gene and a species. Instead of the traditional binary classification framework in which all possible pairs of genes and species in the same article are evaluated, we treat the problem as a sequence-labeling task such that only a fraction of the pairs needs to be considered. Our benchmarking results show that our approach obtains significantly higher performance compared to that of the rule-based baseline method for the species assignment task (from 65.8% to 81.3% in accuracy). The source code and data for species assignment are freely available at https://github.com/ncbi/SpeciesAssignment.
Submitted 8 May, 2022;
originally announced May 2022.
-
Multi-label classification for biomedical literature: an overview of the BioCreative VII LitCovid Track for COVID-19 literature topic annotations
Authors:
Qingyu Chen,
Alexis Allot,
Robert Leaman,
Rezarta Islamaj Doğan,
Jingcheng Du,
Li Fang,
Kai Wang,
Shuo Xu,
Yuefu Zhang,
Parsa Bagherzadeh,
Sabine Bergler,
Aakash Bhatnagar,
Nidhir Bhavsar,
Yung-Chun Chang,
Sheng-Jie Lin,
Wentai Tang,
Hongtong Zhang,
Ilija Tavchioski,
Senja Pollak,
Shubo Tian,
Jinfeng Zhang,
Yulia Otmakhova,
Antonio Jimeno Yepes,
Hang Dong,
Honghan Wu
, et al. (14 additional authors not shown)
Abstract:
The COVID-19 pandemic has been severely impacting global society since December 2019. Extensive research has been undertaken to understand the characteristics of the virus and design vaccines and drugs. The related findings have been reported in biomedical literature at a rate of about 10,000 articles on COVID-19 per month. Such rapid growth significantly challenges manual curation and interpretation. For instance, LitCovid is a literature database of COVID-19-related articles in PubMed, which has accumulated more than 200,000 articles with millions of accesses each month by users worldwide. One primary curation task is to assign up to eight topics (e.g., Diagnosis and Treatment) to the articles in LitCovid. Despite the continuing advances in biomedical text mining methods, few have been dedicated to topic annotations in COVID-19 literature. To close the gap, we organized the BioCreative LitCovid track to call for a community effort to tackle automated topic annotation for COVID-19 literature. The BioCreative LitCovid dataset, consisting of over 30,000 articles with manually reviewed topics, was created for training and testing. It is one of the largest multilabel classification datasets in biomedical scientific literature. Nineteen teams worldwide participated and made 80 submissions in total. Most teams used hybrid systems based on transformers. The highest performing submissions achieved 0.8875, 0.9181, and 0.9394 for macro F1-score, micro F1-score, and instance-based F1-score, respectively. The level of participation and results demonstrate a successful track and help close the gap between dataset curation and method development. The dataset is publicly available via https://ftp.ncbi.nlm.nih.gov/pub/lu/LitCovid/biocreative/ for benchmarking and further development.
Submitted 3 June, 2022; v1 submitted 20 April, 2022;
originally announced April 2022.
-
Interactive Disambiguation for Behavior Tree Execution
Authors:
Matteo Iovino,
Fethiye Irmak Doğan,
Iolanda Leite,
Christian Smith
Abstract:
In recent years, robots have been used in an increasing variety of tasks, especially by small- and medium-sized enterprises. These tasks are usually fast-changing, involve collaborative scenarios, and take place in unpredictable environments with possible ambiguities. It is important to have methods capable of easily generating robot programs that are made as general as possible by handling uncertainties. We present a system that integrates a method to learn Behavior Trees (BTs) from demonstration for pick and place tasks with a framework that uses verbal interaction to ask follow-up clarification questions to resolve ambiguities. During the execution of a task, the system asks for user input when there is a need to disambiguate an object in the scene, that is, when the targets of the task are objects of the same type that are present in multiple instances. The integrated system is demonstrated on different scenarios of a pick and place task, with increasing levels of ambiguity. The code used for this paper is made publicly available.
Submitted 10 March, 2022; v1 submitted 6 March, 2022;
originally announced March 2022.
-
Regret Analysis of Learning-Based MPC with Partially-Unknown Cost Function
Authors:
Ilgin Dogan,
Zuo-Jun Max Shen,
Anil Aswani
Abstract:
The exploration/exploitation trade-off is an inherent challenge in data-driven adaptive control. Though this trade-off has been studied for multi-armed bandits (MABs) and reinforcement learning for linear systems, it is less well-studied for learning-based control of nonlinear systems. A significant theoretical challenge in the nonlinear setting is that there is no explicit characterization of an optimal controller for a given set of cost and system parameters. We propose the use of a finite-horizon oracle controller with full knowledge of parameters as a reasonable surrogate for the optimal controller. This allows us to develop policies in the context of learning-based MPC and MABs and conduct a control-theoretic analysis using techniques from MPC and optimization theory to show that these policies achieve low regret with respect to this finite-horizon oracle. Our simulations exhibit the low regret of our policy on a heating, ventilation, and air-conditioning model with a partially-unknown cost function.
Submitted 27 January, 2023; v1 submitted 4 August, 2021;
originally announced August 2021.
-
Leveraging Explainability for Comprehending Referring Expressions in the Real World
Authors:
Fethiye Irmak Dogan,
Gaspar I. Melsion,
Iolanda Leite
Abstract:
For effective human-robot collaboration, it is crucial for robots to understand requests from users and ask reasonable follow-up questions when there are ambiguities. While comprehending the users' object descriptions in the requests, existing studies have focused on this challenge for limited object categories that can be detected or localized with existing object detection and localization modules. On the other hand, in the wild, it is impossible to limit the object categories that can be encountered during the interaction. To understand described objects and resolve ambiguities in the wild, we propose, for the first time, a method that leverages explainability. Our method focuses on the active regions of a scene to find the described objects without imposing the previous constraints on object categories and natural language instructions. We evaluate our method on varied real-world images and observe that the regions suggested by our method can help resolve ambiguities. When we compare our method with a state-of-the-art baseline, we show that our method performs better in scenes with ambiguous objects that cannot be recognized by existing object detectors.
Submitted 12 July, 2021;
originally announced July 2021.
-
Using Depth for Improving Referring Expression Comprehension in Real-World Environments
Authors:
Fethiye Irmak Dogan,
Iolanda Leite
Abstract:
In a human-robot collaborative task where a robot helps its partner by finding described objects, the depth dimension plays a critical role in successful task completion. Existing studies have mostly focused on comprehending the object descriptions using RGB images. However, 3-dimensional space perception that includes depth information is fundamental in real-world environments. In this work, we propose a method to identify the described objects considering depth dimension data. Using depth features significantly improves performance in scenes where depth data is critical to disambiguate the objects and across our whole evaluation dataset that contains objects that can be specified with and without the depth dimension.
Submitted 9 July, 2021;
originally announced July 2021.
-
Open Challenges on Generating Referring Expressions for Human-Robot Interaction
Authors:
Fethiye Irmak Doğan,
Iolanda Leite
Abstract:
Effective verbal communication is crucial in human-robot collaboration. When a robot helps its human partner to complete a task with verbal instructions, referring expressions are commonly employed during the interaction. Despite many studies on generating referring expressions, crucial open challenges still remain for effective interaction. In this work, we discuss some of these challenges (i.e., using contextual information, taking users' perspectives, and handling misinterpretations in an autonomous manner).
Submitted 19 April, 2021;
originally announced April 2021.
-
Learning to Generate Unambiguous Spatial Referring Expressions for Real-World Environments
Authors:
Fethiye Irmak Doğan,
Sinan Kalkan,
Iolanda Leite
Abstract:
Referring to objects in a natural and unambiguous manner is crucial for effective human-robot interaction. Previous research on learning-based referring expressions has focused primarily on comprehension tasks, while generating referring expressions is still mostly limited to rule-based methods. In this work, we propose a two-stage approach that relies on deep learning for estimating spatial relations to describe an object naturally and unambiguously with a referring expression. We compare our method to the state-of-the-art algorithm in ambiguous environments (e.g., environments that include very similar objects with similar relationships). We show that our method generates referring expressions that people find to be more accurate (~30% better) and would prefer to use (~32% more often).
Submitted 5 August, 2019; v1 submitted 15 April, 2019;
originally announced April 2019.
-
PMC text mining subset in BioC: 2.3 million full text articles and growing
Authors:
Donald C. Comeau,
Chih-Hsuan Wei,
Rezarta Islamaj Doğan,
Zhiyong Lu
Abstract:
Interest in full-text mining of biomedical research articles is growing. NCBI provides the PMC Open Access and Author Manuscript sets of articles, which are available for text mining. We have made all of these articles available in BioC, an XML and JSON format that is convenient for sharing text, annotations, and relations. These articles are available both via FTP for bulk download and via a Web API for updates or more focused collection. Availability: https://www.ncbi.nlm.nih.gov/research/bionlp/APIs/BioC-PMC/
Submitted 16 April, 2018;
originally announced April 2018.
-
CINet: A Learning Based Approach to Incremental Context Modeling in Robots
Authors:
Fethiye Irmak Doğan,
İlker Bozcan,
Mehmet Çelik,
Sinan Kalkan
Abstract:
There have been several attempts at modeling context in robots. However, these attempts either assume a fixed number of contexts or use a rule-based approach to determine when to increment the number of contexts. In this paper, we pose the task of when to increment as a learning problem, which we solve using a Recurrent Neural Network. We show that the network successfully (with 98% testing accuracy) learns to predict when to increment, and demonstrate, in a scene modeling problem (where the correct number of contexts is not known), that the robot increments the number of contexts in an expected manner (i.e., the entropy of the system is reduced). We also present how the incremental model can be used for various scene reasoning tasks.
Submitted 29 July, 2018; v1 submitted 13 October, 2017;
originally announced October 2017.
-
A Deep Incremental Boltzmann Machine for Modeling Context in Robots
Authors:
Fethiye Irmak Doğan,
Hande Çelikkanat,
Sinan Kalkan
Abstract:
Context is an essential capability for robots that are to be as adaptive as possible in challenging environments. Although there are many context modeling efforts, they assume a fixed structure and number of contexts. In this paper, we propose an incremental deep model that extends Restricted Boltzmann Machines. Our model receives one scene at a time and gradually extends the contextual model when necessary, either by adding a new context or a new context layer to form a hierarchy. We show on a scene classification benchmark that our method converges to a good estimate of the contexts of the scenes, and performs better than or on par with other incremental and non-incremental models on several tasks.
Submitted 2 March, 2018; v1 submitted 13 October, 2017;
originally announced October 2017.